Start Date

8-11-2016 2:10 PM

Description

This article outlines one in-house model for archiving and providing access to HTML-based news in the Kentucky Digital Newspaper Program (KDNP) at the University of Kentucky (UK). The KDNP already contains news content digitized from analog sources. To make HTML-based news searchable and retrievable alongside that content, the article describes encapsulating HTML content in XML-encoded CDATA strings that are read by a prototype open-source PHP viewer.
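
As a rough illustration of the encapsulation idea described above, the following minimal sketch wraps an HTML fragment in an XML CDATA section and reads it back unchanged. It is written in Python for brevity rather than in the PHP used by the prototype viewer, and the element names (`article`, `body`), the `id` attribute, and both function names are hypothetical; they are not taken from the KDNP's actual schema or code.

```python
from xml.dom import minidom
import xml.etree.ElementTree as ET


def wrap_html_in_cdata(html: str, article_id: str) -> str:
    """Return an XML document that carries `html` verbatim inside CDATA.

    The <article>/<body> layout is an assumption made for this example.
    """
    doc = minidom.Document()
    root = doc.createElement("article")
    root.setAttribute("id", article_id)
    doc.appendChild(root)
    body = doc.createElement("body")
    root.appendChild(body)

    # A CDATA section may not contain the terminator "]]>", so the HTML is
    # split at each occurrence: "]]" closes one section, ">" opens the next.
    parts = html.split("]]>")
    for i, part in enumerate(parts):
        chunk = part
        if i > 0:
            chunk = ">" + chunk
        if i < len(parts) - 1:
            chunk = chunk + "]]"
        body.appendChild(doc.createCDATASection(chunk))
    return doc.toxml()


def read_html(xml_text: str) -> str:
    """Recover the original HTML string from the wrapper document."""
    body = ET.fromstring(xml_text).find("body")
    return body.text or ""


if __name__ == "__main__":
    sample = "<p>Tom & Jerry</p><script>if (a[b[0]]>1) {}</script>"
    wrapped = wrap_html_in_cdata(sample, "demo-1")
    print(wrapped)
    print(read_html(wrapped) == sample)
```

Because the markup sits inside CDATA, characters such as `<` and `&` need no escaping, so the harvested HTML can be stored and later displayed as it was published while the surrounding XML remains well-formed.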

Notes

The downloadable item is a presentation-based article published in the conference proceedings. It has a different title (Archiving and Accessing HTML-Based Newspapers Using XML and CDATA Strings) and its copyright information is as follows:

Copyright © 2016 by Eric Weig. This work is made available under the terms of the Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0


Harvesting and Parsing an HTML-based Newspaper
