The 'Historical Jewish Press' website has taken upon itself the task of making available online Jewish newspapers published around the world from the eighteenth century onwards. The enormously comprehensive scope of this project raises the question: what is a Jewish newspaper? Is it a newspaper intended for a Jewish readership? Is it a newspaper that deals with "Jewish" topics and concerns (with the further difficulty of defining precisely what those topics and concerns are)? To what extent should local and regional newspapers be included in this category? To what extent should newspapers which were not originally intended for consumption by the general public be included?
The answer to this question, and to additional questions which might arise in the context of defining a newspaper as Jewish, is that the main criterion for thus classifying a newspaper is the fact that its writing and publication are the work of Jewish writers and editors. Naturally, when both the writers and editors are Jews, community issues will find their way into the pages of the newspapers, although it is important to note that the content of these newspapers included a broad variety of subjects influenced by the age group and target audience for whom the newspaper was originally intended. Daily newspapers, local newspapers, regional newspapers, party newspapers, children's newspapers, youth newspapers, institutional newspapers: all of these and more are included under the wide umbrella of our definition of "Jewish newspaper," and the deciding factor for their inclusion is that they were written and edited by Jews. This wide umbrella allows us to offer both the general public and the research community a broad variety of content suited to the preferences and interests of every individual, without imposing limits on the scope of the search.
It is possible to classify the content accessible through this website in several different ways. One such possible division is listed below:
Knowledge and articles from around the world: While the newspapers on this website all bear a Jewish stamp, the majority of them saw themselves engaged in universal subjects and felt an obligation to provide their readers with general knowledge. The website therefore constitutes an important database of general, historical journalism. In this area, the daily newspapers are a particularly outstanding resource.
Knowledge and articles from the Jewish world: As previously stated, when both the writers and editors are Jews, it is natural that quite extensive space will be devoted to Jewish topics and issues. This website provides an exceptionally rich source for study in this area, particularly as the majority of the newspapers did not focus merely on the communities in which they were published but rather examined the entire Diaspora. This website offers a vast collection of articles on the central events in the history of the Jews throughout the nineteenth and twentieth centuries, as well as on the processes which characterized their history, from emancipation and western immigration, to the Holocaust, immigration to Israel, the establishment of the State of Israel, the Jewish-Arab conflict, etc. Additionally, in different sections numerous details can be found on the communities from the area in which the newspaper or journal was published.
Literature, essays, and research studies: This website also includes compositions that are not journalistic in nature. Some of the newspapers and journals also include works of literature for adults and children. Several of the newspapers went beyond the realms of standard journalism toward philosophy and essay-writing, and some even published scientific articles.
Linguistic information: Every newspaper and journal, particularly if its publication extended over a long period of time, may serve as an important linguistic and philological database. In this context, the Hebrew-language newspapers are of particular interest as they reflect the development of modern Hebrew throughout the last two hundred years.
Genealogy and name searches: By assembling newspapers that were published by Jewish communities throughout the world, the Historical Jewish Press website constitutes an extremely important genealogical resource for people searching for relatives and reconstructing family trees. While the daily newspapers included in the website are outstanding for their reference to important and famous people, it is actually the local, less centralized newspapers that are likely to provide information on more average citizens.
Advertisements: The majority of newspapers which appear on this website were commercial publications; they would have been read by the general public and would therefore include advertisements which helped to finance the newspaper. The Historical Jewish Press website allows users to search the included publications' advertisements, thus revealing stylistic and technological changes in this area throughout the years.
The fact that the newspapers collected on this website are publications from the past should guide the user's operation and search. The user should pay attention to two primary areas of potential difficulty:
Since language is not static, but rather tends to change over time, it is reasonable to expect that names of places, institutions, and organizations would also change over time. The names of countries and cities are particularly prone to change, as a result of both changing borders and a desire to differentiate from the past.
Thus, for example, a large part of the territory of current-day Germany was previously named Prussia after the dominant kingdom in the region. A user searching for articles related to this area is therefore advised to use both entries (i.e., "Germany" and "Prussia") in order to get the greatest possible number of relevant results. An additional example is the city of Istanbul, located in modern-day Turkey. Until 1930, the city's name was Constantinople, so a search of newspaper issues published before that date using the word "Istanbul" is likely to return few or no results, whereas searching for "Constantinople" will indeed return results.
In addition to changes resulting from the passage of time, there are also differences based on language that need to be taken into account when searching through the newspapers on this website. The names of places, institutions, and organizations change significantly as they pass from one language to another. For instance, the user who searches in English-language newspapers for information on underground resistance in Israel must be aware of the differences in names between English and Hebrew. The National Military Organization (Irgun Tzeva'i Leumi) was known to most English speakers simply as the "Irgun," whereas the Organization of Fighters for the Freedom of Israel (Irgun Lochamei Cheirut Yisrael) was called "The Stern Gang" in English, after its leader Abraham Stern.
Our general recommendation, which is doubly valid in the event that your search returned few or no results, is to perform a brief inquiry regarding alternative names for the concepts you wished to search for in the newspapers. A quick search in one of the major search engines or online encyclopedias will very likely reveal linguistic and historical changes in the sought-after terms.
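The recommendation above can be illustrated with a short Python sketch. It is only an illustration and not part of the website: the `ALTERNATIVE_NAMES` table and the `expand_query` helper are hypothetical, and the entries are simply the examples mentioned in the text.

```python
# Hypothetical lookup table of alternative names, built from the examples above.
ALTERNATIVE_NAMES = {
    "germany": ["Prussia"],
    "istanbul": ["Constantinople"],
    "irgun tzeva'i leumi": ["Irgun", "National Military Organization"],
}


def expand_query(term):
    """Return the original term followed by any known alternative names."""
    alternatives = ALTERNATIVE_NAMES.get(term.strip().lower(), [])
    return [term] + alternatives


if __name__ == "__main__":
    # Each query is shown together with the extra names worth searching for.
    for query in ("Istanbul", "Germany", "Tel Aviv"):
        print(query, "->", expand_query(query))
```

Running each returned name as a separate search, and then combining the results, approximates the manual procedure described above.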
The project encourages other institutions and collections to join in the development of this website.
Developing a Section
The Historical Jewish Press website is divided into different linguistic, cultural, regional, and chronological sections, such as 19th Century Hebrew-language Press and Jewish Press from Arab Countries. It is our intention to develop additional sections: Yiddish Press, Jewish English-language Press, Ladino Press, and much more. We sincerely welcome all parties interested in developing a comprehensive section of this type, an undertaking which is generally most suitable to public or academic institutions and foundations. The partnership will be explicitly noted in the section and in all search results produced therein, as well as on the website's home page and in the list of acknowledgements.
Developing a National Sub-Section
Many sections include within their frameworks Jewish newspapers that were published in a number of different countries. For instance, the section of Jewish Press in Arab Countries, still in its initial stage of development, contains newspapers from both Morocco and Egypt. It is our intention to create sub-sections for each national press included in the larger, more inclusive sections. We sincerely welcome all parties interested in developing such a sub-section, an undertaking which is generally most suitable to community or national organizations and donors with an interest in a particular community. The partnership will be explicitly noted in the sub-section and in all search results produced therein, as well as in the list of acknowledgements.
Uploading a Newspaper to the Website
We sincerely welcome contributions for the purpose of adding a particular newspaper or collection of newspapers to the website. A contribution of this kind is especially suitable for parties with an interest in newspapers, and it can serve as a powerful memorial tribute to special people, events, or causes in the donor's life. Moreover, for those who have not yet chosen a newspaper which they would like to see added to the collection, the Historical Jewish Press website has an extensive list of newspapers, from which it is possible to choose according to personal or institutional preferences: country or city of publication, type of newspaper (daily, children's, political party), frequency of publication (daily, weekly, monthly), and scope of the publication (overall number of pages). The name of the donor and/or the commemoration will be explicitly noted on the newspaper's webpage and in all search results produced therein.
Contributing Rare Newspapers
We sincerely welcome the contribution of rare newspapers and additional materials relating to the Jewish press or its history throughout the years. Contributions of this type will generally be referred to the National Library, although in special cases they may be directed towards other partners in the project.
We sincerely welcome all other contributions which may assist in the development of this website and in the advancement of its goals.
If you are interested in becoming a partner or otherwise assisting in the development of the project in any way, please contact us through the website or write to us at the following address:
Prof. Yaron Tsur,
Historical Jewish Press Site
Tel Aviv University
Tel Aviv (69978)
Two primary technologies form the basis of the computerization of texts, that is, the conversion of printed material into digital files, or as it is professionally called, the digitalization of texts:
Scanning
Writing identification technology (OCR, Optical Character Recognition)
Scanning produces a photographic copy of printed material, saved as a simple picture file. In this sense, scanning a drawing and scanning a text yield exactly the same product, since the computer treats them both as images and does not distinguish between them based on content. To bridge this gap, writing identification technology (OCR) was developed, which allows a picture file containing text to be turned into a searchable text file. This is accomplished by recognizing patterns of dots in a picture as letters within words. These two technologies (scanning and OCR) are relatively long-standing: scanning technology was successfully implemented at the end of the 1950s, and writing identification technology was put to commercial use during the 1960s. However, these technologies have developed greatly over the years, and the programs that implement them today are able to cope with a broad range of languages and fonts. Moreover, the identification rate for each language has risen significantly in recent years and, provided that the quality of both the original material and the scan is good, it is possible to achieve identification accuracy of over ninety percent.
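The idea of recognizing patterns of dots as letters can be illustrated with a toy Python sketch. It is not how the website's OCR software works: the 3x5 `TEMPLATES` grids and the nearest-template rule are simplifying assumptions chosen for illustration, and real OCR engines use far richer models.

```python
# Toy 3x5 dot patterns for three letters ("#" is ink, "." is blank paper).
# These templates are invented for illustration only.
TEMPLATES = {
    "I": ("###",
          ".#.",
          ".#.",
          ".#.",
          "###"),
    "L": ("#..",
          "#..",
          "#..",
          "#..",
          "###"),
    "T": ("###",
          ".#.",
          ".#.",
          ".#.",
          ".#."),
}


def distance(glyph, template):
    """Count the dots that differ between a scanned glyph and a template."""
    return sum(
        g != t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the letter whose template differs from the glyph in the fewest dots."""
    return min(TEMPLATES, key=lambda letter: distance(glyph, TEMPLATES[letter]))


# A slightly damaged "T": one dot is missing from the top bar.
scanned = ("##.",
           ".#.",
           ".#.",
           ".#.",
           ".#.")

if __name__ == "__main__":
    print(recognize(scanned))
```

Because the damaged glyph still differs from the "T" template in fewer dots than from any other template, it is identified correctly; heavier damage would eventually tip the decision toward the wrong letter, which is the kind of error discussed later in this text.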
In the transition from the digitalization of simple texts (for example, letters or official documents) to that of newspapers, the importance of a third technology becomes apparent:
Segmentation means the division of the scanned page into the distinct, logical sections from which it is assembled. In particular, with regard to a page from a newspaper, segmentation is the division of the page into the different articles present therein. Without this division, the newspaper page constitutes the smallest possible searchable unit, and the format of the search results will be based upon how many times the searched-for subject appears on the page. Clearly this method is highly problematic for the organization of search results, because within a newspaper the basic unit of information is not the page but rather the article, which is likely to take up only a small part of the page and may very well be continued on subsequent pages. Thanks to segmentation technology, the user can obtain search results based on the original articles and the relevance of each article to the subject on which the search was performed.
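The difference between page-level and article-level search can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `Article` and `Page` classes, both search functions, and the sample texts are hypothetical and do not reflect the website's actual data model.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    text: str


@dataclass
class Page:
    number: int
    articles: list


def search_pages(pages, word):
    """Without segmentation: the page is the smallest searchable unit."""
    word = word.lower()
    hits = []
    for page in pages:
        # The whole page is treated as one undivided block of text.
        full_text = " ".join(a.title + " " + a.text for a in page.articles).lower()
        count = full_text.split().count(word)
        if count:
            hits.append((page.number, count))
    return hits


def search_articles(pages, word):
    """With segmentation: each article is matched on its own."""
    word = word.lower()
    return [
        (page.number, article.title)
        for page in pages
        for article in page.articles
        if word in (article.title + " " + article.text).lower().split()
    ]


# Invented sample data.
PAGES = [
    Page(1, [Article("Harbor news", "a ship arrived in haifa today"),
             Article("Market prices", "wheat prices rose in haifa")]),
    Page(2, [Article("Theater review", "a new play opened last night")]),
]

if __name__ == "__main__":
    print(search_pages(PAGES, "Haifa"))
    print(search_articles(PAGES, "Haifa"))
```

The page-level search can only report that page 1 contains the word twice, whereas the article-level search points to the two individual articles in which it appears.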
The scanning of a newspaper is done from one of three possible sources: paper, microfilm, or microfiche. Every effort is made to use the very best copy, which is determined by both quality and clarity of writing and the completeness of the inventory of the newspaper editions. This is no easy task, especially since newspapers undergo a constant process of wear and disintegration. In this sense, efforts to scan archival material, particularly historical newspapers, are part of a larger undertaking: the preservation of information and knowledge which might otherwise be lost forever.
The two additional technologies, writing identification and segmentation, operate when the Veridian software converts the scanned pages into electronic versions of the newspaper. This stage, which is largely automatic, includes identification of all the different elements of each article, which, as stated above, is the fundamental building block of the newspaper:
Main body of article
Accompanying illustrations or photographs
Within each of these elements the searched-for word or phrase is identified, and to each result is assigned a corresponding level of relevance to the search. Thus, for instance, when we search for a certain keyword (e.g., a name) the system will give preference to articles in which that keyword appears in the title over other articles in which that same keyword merely appears in the main body of the article.
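The preference for title matches over body matches can be sketched as a simple weighted score. The weights, the `relevance` and `rank` functions, and the sample articles below are assumptions made for illustration; the actual ranking formula used by the system is not described in this text.

```python
TITLE_WEIGHT = 3  # assumed value, chosen only for illustration
BODY_WEIGHT = 1   # assumed value, chosen only for illustration


def relevance(article, keyword):
    """Score an article: matches in the title count more than matches in the body."""
    keyword = keyword.lower()
    title_hits = article["title"].lower().split().count(keyword)
    body_hits = article["body"].lower().split().count(keyword)
    return TITLE_WEIGHT * title_hits + BODY_WEIGHT * body_hits


def rank(articles, keyword):
    """Return the titles of matching articles, most relevant first."""
    scored = [(relevance(a, keyword), a) for a in articles]
    matching = [(score, a) for score, a in scored if score > 0]
    matching.sort(key=lambda pair: pair[0], reverse=True)
    return [a["title"] for _, a in matching]


# Invented sample data.
ARTICLES = [
    {"title": "Community meeting", "body": "herzl spoke briefly at the meeting"},
    {"title": "Herzl visits Vienna", "body": "the visit lasted two days"},
    {"title": "Weather", "body": "rain expected"},
]

if __name__ == "__main__":
    print(rank(ARTICLES, "Herzl"))
```

In this sketch the article with the keyword in its title is listed before the article that mentions it only in the body, and the article without the keyword is omitted.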
The final product of the processing stage is a vast collection of files which constitute the electronic version of a publication. Each article is composed of image files of the original document and text files of the content as identified by OCR. What the user sees when viewing an article is actually that article's image, whereas the identified text is posted "behind" that image. Presentation of the newspapers is achieved using Extensible Markup Language (XML) technology, which enables the future adaptation of the material to other platforms.
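As a rough illustration of how an article might be described in XML, with the image the user sees and the OCR text kept "behind" it, consider the following Python sketch. The element names, attribute names, and file name are hypothetical; the website's actual XML schema is not given in this text.

```python
import xml.etree.ElementTree as ET

# Build a hypothetical article record: one image reference plus the OCR text.
article = ET.Element("article", id="example-001")
ET.SubElement(article, "title").text = "Example headline"
ET.SubElement(article, "image", href="page-01-article-03.png")
ET.SubElement(article, "text").text = "Text of the article as identified by OCR."

# Serialize to a string, as it might be stored or exchanged between platforms.
xml_string = ET.tostring(article, encoding="unicode")

# Any XML-aware platform can read the same record back.
parsed = ET.fromstring(xml_string)
image_shown_to_user = parsed.find("image").get("href")
searchable_text = parsed.find("text").text

if __name__ == "__main__":
    print(xml_string)
    print(image_shown_to_user)
    print(searchable_text)
```

Keeping the image reference and the identified text in one self-describing record is what makes it possible to display the original article while searching its text, and to move the material to other platforms later.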
Although the three main technologies of which this website makes use (scanning, OCR, and segmentation) are mature and time-proven, they are still not perfect. Neither OCR nor segmentation identification has reached a 100% standard of accuracy, and the poorer the quality of the material, the lower the level of accuracy in identification. Because the Historical Jewish Press website works with newspapers from the past, and sometimes even the distant past, we are compelled to deal with many different phenomena that threaten to ruin the identification process. These phenomena include inferior quality of printing (which characterizes early publications), yellowing paper, marks or damage in the original printing, unique fonts, torn pages, scribbled-on pages, and even pages damaged by rodents.
Technological limitations combine with the limitations of the raw materials with which we work and are manifested in two primary problems that the user is likely to encounter: word identification errors and segmentation errors. Word identification errors appear either in the form of existing words that are not identified or in the form of words that are mistakenly identified. In the first case, the user will see that a certain word is present in the article but fails to come up during a textual search. In the second case, the user will see that the identified word is not the same as the word that was searched for. Both of these occurrences are known phenomena and need to be taken into account by the user. Despite this limitation, however, the chance of finding entries is not significantly reduced. For the most part, the sought-for word or phrase appears more than once in the article, so even if an identification error occurs the first time the word appears, there is a very good chance that a later occurrence will be identified successfully and the article will appear on the list of search results.
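The effect of repeated occurrences can be estimated with a short calculation. The sketch below assumes that each occurrence of a word is identified correctly with the same probability and independently of the others; both are simplifying assumptions, and the 90 percent figure is used only as an example.

```python
def chance_of_finding(per_word_accuracy, occurrences):
    """Probability that at least one occurrence of the word is identified correctly.

    Assumes every occurrence is identified independently with the same accuracy.
    """
    return 1 - (1 - per_word_accuracy) ** occurrences


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, round(chance_of_finding(0.9, n), 3))
```

Under these assumptions, a word identified correctly 90 percent of the time has about a 99 percent chance of being found in an article where it appears twice.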
The second problem the user may encounter is segmentation errors. Here, too, errors are likely to arise in one of two ways: either as identification of several articles together as one unit, or as identification of one article as several different articles. As a rule, segmentation errors are less critical than word identification errors because they do not prevent the finding of articles matching the sought-for concept but are simply likely to disrupt the organization of the results. Segmentation errors may, however, cause the user some inconvenience, since it becomes necessary to open the full page of the newspaper and identify the proper boundaries of the article. The Historical Jewish Press website makes every effort to minimize both word identification and segmentation errors.
In conclusion, it is important to remember that a search of the newspapers is done in a free-text environment, which means that if a certain query does not yield search results, or comes up with only a few suggestions, there is a good chance that the spelling of that query was not accurate. This is likely to occur because of a simple error in spelling or because in the past that word or phrase was spelled differently.
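One way to recover from an inaccurate spelling is to compare the query with words known to occur in the texts and suggest the closest ones. The sketch below uses Python's standard `difflib` module; the `KNOWN_WORDS` list, the `suggest_spellings` helper, and the cutoff value are hypothetical and are not a description of the website's search engine.

```python
import difflib

# Hypothetical list of words known to occur in the indexed newspapers.
KNOWN_WORDS = ["Constantinople", "Istanbul", "Prussia", "Germany"]


def suggest_spellings(query, known_words=KNOWN_WORDS, cutoff=0.8):
    """Return up to three known words that closely resemble the query."""
    return difflib.get_close_matches(query, known_words, n=3, cutoff=cutoff)


if __name__ == "__main__":
    print(suggest_spellings("Constantinopel"))
    print(suggest_spellings("Prusia"))
```

A user whose query returns nothing can then retry the search with one of the suggested spellings.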