Archived

This content is available here strictly for research, reference, and recordkeeping purposes, and it may not be fully accessible. If you work or study at the University of Kentucky and would like to request an accessible version, please use the SensusAccess Document Converter.

Start Date

8-11-2016 2:10 PM

Description

This article outlines one in-house model for archiving and providing access to HTML-based news in the Kentucky Digital Newspaper Program (KDNP) at the University of Kentucky (UK). The KDNP already contains news content digitized from analog sources. To allow search and retrieval of HTML-based news alongside that content, the article describes encapsulating HTML content in XML-encoded CDATA strings that are read by a prototype open-source PHP viewer.
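
The encapsulation idea can be illustrated with a minimal sketch. It is not the KDNP implementation: the prototype viewer is written in PHP, while this example uses Python's standard library, and the element names (`article`, `title`, `body`) and both function names are hypothetical. It shows only how raw HTML can be stored unescaped inside an XML CDATA section and recovered unchanged.

```python
from xml.dom import minidom


def wrap_html_in_cdata(html: str, title: str) -> str:
    """Return an XML document that carries `html` inside CDATA sections.

    The element names used here are illustrative assumptions, not a
    schema taken from the KDNP.
    """
    doc = minidom.Document()
    article = doc.createElement("article")
    doc.appendChild(article)

    title_el = doc.createElement("title")
    title_el.appendChild(doc.createTextNode(title))
    article.appendChild(title_el)

    body = doc.createElement("body")
    # A CDATA section cannot contain the sequence "]]>", so the HTML is
    # split at each occurrence and spread across adjacent sections.
    parts = html.split("]]>")
    for i, part in enumerate(parts):
        if i < len(parts) - 1:
            part += "]]"
        if i > 0:
            part = ">" + part
        body.appendChild(doc.createCDATASection(part))
    article.appendChild(body)
    return doc.toxml()


def extract_html(xml_text: str) -> str:
    """Recover the original HTML string from the wrapped XML."""
    doc = minidom.parseString(xml_text)
    body = doc.getElementsByTagName("body")[0]
    # Join the data of every character-data node under <body>.
    return "".join(node.data for node in body.childNodes)


if __name__ == "__main__":
    sample = "<p>Tom & Jerry <b>in the news</b></p>"
    wrapped = wrap_html_in_cdata(sample, "Sample story")
    print(wrapped)
    print(extract_html(wrapped) == sample)
```

Because the HTML sits inside CDATA, the XML parser treats characters such as `<` and `&` as literal text, so harvested markup does not need to be escaped or made well-formed before it is stored.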

Notes

The downloadable item is a presentation-based article published in the conference proceedings. It has a different title (Archiving and Accessing HTML-Based Newspapers Using XML and CDATA Strings) and its copyright information is as follows:

Copyright © 2016 by Eric Weig. This work is made available under the terms of the Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0

 

Harvesting and Parsing an HTML-based Newspaper
