Landships II

Post Info TOPIC: On-line newspaper archives


Legend

Status: Offline
Posts: 1152
Date:
On-line newspaper archives
Permalink   


Has this been mentioned before?  I can't find previous mention offhand.

James' mention of acquiring a Daily Mail page of November 2nd,  1914 reminds me - the on-line British newspaper archive is up and running, additional material is being added almost daily with an objective of archiving (digitising) "up to 40 million newspaper pages from the British Library's vast collection over the next 10 years."  Search site is:

http://www.britishnewspaperarchive.co.uk/

Searches are free but you have to pay or subscribe to get the page detail.  Digitising involves Optical Character Recognition which means searching on the rendered text can be somewhat more of an art than a science.

The Australian equivalent is:

http://trove.nla.gov.au/newspaper

That one is (so far) free but they appear to be a little over-loved at the moment ("We are experiencing some intermittent problems ..." often a sign that their server hasn't the "grunt" to handle the growing number of queries).  Trove allows viewers to log-in and correct the OCR text so the quality is potentially superior to most such resources.

Both resources support simple and advanced searches.  Similar facilities no doubt exist in numerous other countries.  Anyone knowing of these, please add the links and any comments you might have. 

Off topic for newspapers but Trove has other "collections" and is also, for instance, a shortcut way (from another search page) to search the various photographic collections (Australian War Memorial, several of the State Government collections, etc.).



__________________
Facimus et Frangimus


Colonel

Status: Offline
Posts: 248
Date:
Permalink   

There are problems, I find:

1. Early days yet - a lot of online newspaper archives in Britain still omit 1914-19. The older British Library Newspaper archive for the 19th century is far more complete than the new one about which there has been a lot of fuss. But it ends in 1900 (and the new one seems to end in 1903 anyway, so far, while the Highlands Council Library service also omits the Great War years). This will change however with time.

2. No doubt because of the London Government's obsession with charging for every public service, and because a few are in any case provided by the original newspaper, most or all are paying or subscription - unlike the very praiseworthy Aussie jobs mentioned here, which I have used for my quite separate academic research - bonzer chaps indeed. However, do check out the options in a major city reference library - they often have access via their desks or even their own website. (People with university library access have a big advantage here.) For instance, the Times has a more or less complete run, as does the Scotsman - both covering the GW years and available for pay through themselves but also free through some libraries. For instance, those who can register with the National Library of Scotland can get access to the Scotsman website through the NLS website www.nls.uk . Very, very useful.

3. Reliability of search. Watch the search engines. Firstly, some default to Keyword - which is selected by the editors - when a free text search would usually be more useful. Second, the automated optical software that compiles the indices often doesn't work very well, especially if the scan is rough. Be prepared to use multiple angles of attack and various keywords and combinations - and in extremis to read through the whole paper if something ought to be happening on date x but you can't find it on automatic search. Third, searching for 'tank' for instance is a pain on its own - too many locomotives, bulk carrier ships, motor cars, sewage plants, weddings of Tank Corps chaps, etc. etc. - but using the wrong keyword will lose what you are looking for. And bear in mind that if you're paying by the article it will get very expensive very soon.

But it beats travelling to libraries to mess around with microfilm - and it helps solve the old problem of needing to know when and where something happened to find out about it at all.
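For anyone who ends up with a pile of search-result snippets to wade through, the two search problems in point 3 (OCR misreadings and irrelevant 'tank' hits) can be attacked with a few lines of script. What follows is only a rough sketch in Python using the standard difflib module: the noise-word list, the function names and the 0.7 similarity cutoff are all my own assumptions for illustration, not anything the archive sites provide.

```python
import difflib
import re

# Hypothetical list of words that suggest the wrong kind of "tank"
# (locomotives, sewage plants, etc.). Adjust to taste.
NOISE_WORDS = {"locomotive", "sewage", "steamer", "wedding", "motor"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def matches_keyword(words, keyword, cutoff=0.7):
    """True if any word is the keyword or a near miss (possible OCR misreading)."""
    return bool(difflib.get_close_matches(keyword.lower(), words, n=1, cutoff=cutoff))


def filter_hits(snippets, keyword, noise_words=NOISE_WORDS, cutoff=0.7):
    """Keep snippets that mention the keyword (fuzzily) and contain no noise word."""
    kept = []
    for snippet in snippets:
        words = tokenize(snippet)
        if matches_keyword(words, keyword, cutoff) and noise_words.isdisjoint(words):
            kept.append(snippet)
    return kept


if __name__ == "__main__":
    hits = [
        "The new tanks advanced at Flers",
        "Tank locomotive derailed near Crewe",
        "A tauk crossed the trench",
        "Sewage tank contract awarded",
        "Wheat prices rise again",
    ]
    for line in filter_hits(hits, "tank"):
        print(line)
```

The fuzzy match cuts both ways: a cutoff loose enough to catch a misread "tauk" will also let through genuine words such as "bank" or "task", so this narrows the pile rather than replacing a read-through.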

 

 



__________________


Legend

Status: Offline
Posts: 1152
Date:
Permalink   

Thanks for the tips Mike, particularly the bit about registration with the NLS. In that vein, it should be mentioned that free registration with the National Library of Australia (NLA) for those who are able gives wider access to some "electronic resources", those to which the library has subscribed (their own internal resources are free).

Both the newspaper archives are, as you say, presently limited but growing daily. With the Australian one I found it worthwhile checking every month or so for additional results. The British one might be similarly rewarding.

OCR errors. I'm sure there must be a website somewhere that records "the most comical OCR reading errors we've seen," what with all sorts of archives now on line (all those books in archive.org ...) and with OCR now very widespread for those. It would probably help divert the darker thoughts when struggling with the text searches.



__________________
Facimus et Frangimus


Lieutenant-Colonel

Status: Offline
Posts: 197
Date:
Permalink   

The National Library of Singapore has a good and free archive too. I have noticed that Singapore newspapers often published exactly the same news as British newspapers.

http://newspapers.nl.sg/

 



__________________


Colonel

Status: Offline
Posts: 248
Date:
Permalink   

Most interesting replies - I'm keeping a close eye on this.

On a technical point, my university librarians are kindly checking the situation with the 'new' BL archive which rather oddly doesn't have an institutional subscription option on the website! So presumably early days.

I am waiting to see how - or indeed if - it connects to the older 19th century BL British Newspapers database with which it seems to overlap heavily and which is routinely available in UK academic libraries - as it happens, the overhanging bits are potentially critical to my (non-GW) research.

In particular, a key question is whether the 'new' stuff will be available on a cooperative multi-institution deal such as 'JISC' - which would be highly relevant to at least those of us who can use UK university and college  libraries etc.

I'll report back if I get any hard info.

Mike



__________________


Legend

Status: Offline
Posts: 1152
Date:
Permalink   

Thanks again Mike.

Not a newspaper but certainly a publication and one already well known to some forumites here is the London Gazette search. May as well note it as well, to keep the on-line references in one place:

http://www.london-gazette.co.uk/search

Superb reference, but with its very own set of OCR idiosyncrasies (thinking of the pages where "small caps", as used in some surnames etc., are not rendered in readable text at all, though they are fine in other pages).  For all that, the Australian gazette (Commonwealth; haven't checked the various States) doesn't even come close, but I will leave that aside at this time.

Some members have found the AWM photographic collection and the honours and awards search useful.  Even further removed from the topic (on-line at least) but research often dives into such rabbit burrows:

http://www.awm.gov.au/collection/photographs/

http://www.awm.gov.au/research/people/honours_and_awards/

Steve



__________________
Facimus et Frangimus


Legend

Status: Offline
Posts: 1416
Date:
Permalink   

A very limited number of 20th Century newspapers on the British Newspaper Archive so far, but I have found information I was previously lacking so it's already proved its worth for me. Worst OCR error so far? "Deceased" rendered as "Defeated".
Gwyn

__________________


Colonel

Status: Offline
Posts: 248
Date:
Permalink   

An update - I have now heard back from a librarian at my university library.

It appears that the new Brightsolid/British Library initiative mentioned above

http://www.britishnewspaperarchive.co.uk/

is seemingly for personal/individual subs only - institutional access is still not available (as far as the librarian could ascertain) and there is nothing yet about any proposals for this. I find this astonishing, nay astounding, given that the BL is involved ...

Anyway, what this means is that there is no access through your national, university or local council library, on site or online via their website. This is a big disappointment, especially as it includes the scans from the earlier batches of digitised newspapers that were done with public money in the form of research council grants. The original and separate BL database, "British Newspapers 1700-1900", will NOT be extended even when they scan new newspapers/years for the new commercial setup. This is a big shame as it will automatically miss out 1914-18.

I'll let you know if the situation changes (as should anyone else who hears news). Meanwhile, yet another way to spend the bawbees.


__________________


Legend

Status: Offline
Posts: 1626
Date:
Permalink   

L'Illustration, the French illustrated newspaper, includes an archive...

http://www.lillustration.com/

Cheers



__________________

"Ash nazg durbatulûk, ash nazg gimbatul, ash nazg thrakatulûk, agh burzum-ishi krimpatul"

 



Colonel

Status: Offline
Posts: 248
Date:
Permalink   

I'm trying out britishnewspaperarchive.co.uk (commercial partner = Brightsolid) with decidedly mixed results.

I have suddenly realised that it seemingly reuses Brightsolid's interface for the National Archives of Scotland for Scottish statutory personal data - birth, marriage, death, census etc. as on scotlandspeople.gov.uk. Now this last works pretty well (and I prefer it to the English equivalent on ancestry.co.uk in many ways). But it doesn't translate so well over to newspapers which have longer text than the average census entry, if you aren't simply looking for your ancestor's wedding notice.

I also find the output in terms of imagery very poor for newspapers. Has anyone, please, managed to work out how to print out/save an article alone as an image without having to (a) machine-convert to plaintext, which I don't want to do because of the error risk or (b) convert to PDF, which means further work and cost in terms of Adobe software to convert to a normal image that I can edit and crop down?

Unless I am missing something, my overall assessment would be that it has some newspapers and/or dates not available on the original BL British Newspapers 1700-1900 (commercial partner = Gale) setup. (But the latter has a lot not available on the new site, and/or its search function is so rough that I'm not getting a lot that comes over onto the original BL database - not sure which.)

So to be fair the new site has already proved well worth the £6.95 for a trial 2 day period as it coughed up some letters to the editor I needed for the 'day job' research (1840s history) - compare the time and trouble and cost to travel to a library with microfilms of Dorset newspapers. But - again unless I am missing something - I would always use the original BL database first and use the new one for items not on the original database. However, as this older database does not extend beyond 1900 it may not be much use to many of us in this field!



__________________


Legend

Status: Offline
Posts: 1626
Date:
Permalink   

As an alternative to Adobe I can recommend Foxit PDF reader: it uses the same basic software but is smaller, has more features and runs smoother. For creating PDFs I would recommend PDF ReDirect, which creates PDF files instead of printing (hence the redirect bit). Both are free and, again, easy to use...

http://www.foxitsoftware.com/Secure_PDF_Reader/

http://download.cnet.com/PDF-ReDirect/3000-10743_4-10255233.html

Hope this is of some help

 

Cheers

 



__________________

"Ash nazg durbatulûk, ash nazg gimbatul, ash nazg thrakatulûk, agh burzum-ishi krimpatul"

 



Colonel

Status: Offline
Posts: 248
Date:
Permalink   

Thanks Ironsides - very useful to know about for other purposes.

However the problem here is that the site will seemingly only produce a pdf - you don't get the option to divert it to say a jpg or image. So I'm still stuck with cutting out bits of pdfs using the Adobe "save bit in blue rectangle" function and pasting them into Word documents ... could be worse I suppose!



__________________


Colonel

Status: Offline
Posts: 248
Date:
Permalink   

An update. After a polite complaint to britishnewspaperarchive - whose chap to his credit consulted within the organization and got back to me very quickly - it emerges that the 'print etc article' (as opposed to whole newspaper page) function was never there at all, despite the site's clear instructions for using it! Anyway I gather they have amended that now. I'm astounded nobody seems to have pulled this up before ...

I have sent a polite but very definite complaint about its absence at all (never mind what the instructions say).

However, the site does give access to some periodicals not otherwise accessible except through a long session with the microfilm (and associated travel costs).

Mike



__________________


Legend

Status: Offline
Posts: 1152
Date:
Permalink   

Excellent Mike, thanks for persevering with that, improvements may follow.

I may have mentioned it before but I use the (free) German IrfanView graphics utility which handles PDF and converts to/between graphics formats (it even has OCR but I haven't had a lot of luck with that, though, being German, it supposedly handles Fraktur). Anyway, it is (almost) infinitely "extensible" through various "add-ons" and even combines separate PDF pages into a single document, just like a "bought one", when the appropriate GhostScript utility is present (separate but also free). If you've ever downloaded an archived article from Flight Magazine ... IrfanView is/can be quite complex but it is a comprehensive tool and there is a users' forum out there somewhere (which is where I got the information to find and install and configure GhostScript).
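For those comfortable with a command line, Ghostscript on its own can turn a PDF into ordinary image files, which can then be cropped in any graphics program. The sketch below is only an illustration in Python: the function names and the file name "article.pdf" are made up for the example, and it assumes the Ghostscript executable is installed and called "gs" (on Windows it is usually "gswin64c" or similar). The switches used are standard Ghostscript ones.

```python
import shutil
import subprocess


def ghostscript_png_command(pdf_path, output_pattern="page-%03d.png",
                            dpi=300, executable="gs"):
    """Build the Ghostscript command that renders each PDF page to a PNG file."""
    return [
        executable,
        "-dNOPAUSE",          # do not pause between pages
        "-dBATCH",            # exit when the file has been processed
        "-dSAFER",            # restrict file access while rendering
        "-sDEVICE=png16m",    # 24-bit colour PNG output
        f"-r{dpi}",           # resolution in dots per inch
        f"-sOutputFile={output_pattern}",  # %03d becomes the page number
        pdf_path,
    ]


def pdf_to_png(pdf_path, **options):
    """Run Ghostscript; raises if the executable is missing or the run fails."""
    command = ghostscript_png_command(pdf_path, **options)
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} not found on PATH")
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical file name, for illustration only.
    print(" ".join(ghostscript_png_command("article.pdf", dpi=150)))
```

Swapping "png16m" for "jpeg" in the device switch gives JPEG output instead; a higher dpi figure gives a sharper image of small newsprint at the cost of larger files.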

Actually, the "best" OCR I have that doesn't involve additional cost is the one which comes with my multi-function printer's scanning software - whenever I can remember how to point it at an image file that isn't being created by scan. I'm supposing most/all "multi-function" printers have the same (or are even easier to bend to one's will). But it doesn't handle the Fraktur. And I've forgotten again. Hopefully I've kept a note somewhere ... just have to find it.

__________________
Facimus et Frangimus


Colonel

Status: Offline
Posts: 248
Date:
Permalink   

Many thanks for that - I may have to go down that route if it comes to that (and if I find I don't after all need Acrobat full or at least partial functionality for other reasons such as proof correcting). And it will be of interest to others.

Incidentally, the "print article rather than whole page" problem is of course the same problem as "save article" - but as you say we will see what happens in terms of the site owners doing something about it, or not as the case may be (though I did point out to them that their competitors had no problem there at all). The site does have a fair bit of post-1900 content though it's still patchy.



__________________


Legend

Status: Offline
Posts: 1152
Date:
Permalink   

Hopefully you won't need it Mike - but worth remembering just in case.  Oh, I should mention IrfanView appears to be Austrian, not German.



__________________
Facimus et Frangimus


Colonel

Status: Offline
Posts: 248
Date:
Permalink   

A bit of news - the National Library of Scotland is allowing access to British Newspaper Archive for users physically on their (Edinburgh) premises, though not (alas) remotely online. It does show that BNA have evidently changed their business model a bit to allow institutional subs of at least some kind. Might be worth keeping an eye open if you use a national, institutional or major local authority library.

I still will use my personal sub for online access from home, however - it's worth it for other research that I do.

__________________