Patrick Herron
School of Library and Information Science, The University of North Carolina, Chapel Hill, North Carolina
Master's candidate in Information Science, M.S.I.S. with a Computer Science Minor and Bioinformatics Certificate anticipated August 2006
Goal: To study text-based discovery and develop text mining systems
Research Courses Personal

As of November 2006 I have completed my master's thesis on text mining adoption and innovation at a large pharmaceutical company (full text). The thesis has three main components: (a) a theoretical treatment of text mining; (b) a review of business and scientific applications of text mining; and (c) a case study of text mining adoption for pharmacogenomics (PGx) drug discovery. In it I develop a quality model for evaluating novel drug discovery information that is generated (rather than merely extracted) from multiple literature and data inputs, and I propose a new definition of text mining that distinguishes it from data mining, information retrieval, and information extraction. My thesis advisor is Dr. Stephanie Haas.

From 2004 to 2006 I experimented with different concept-based feature representations for automatic text classification and clustering, using the NC Health Info community health website collection as a corpus. The goal of the experiments was to automate the generation of both index and topic term sets for information architecture and cataloging tasks. NC Health Info is a joint project of SILS and the UNC Health Sciences Library, funded by a grant from the National Library of Medicine.

Organization of Information
Knowledge Discovery
Bioinformatics Seminar
Information Retrieval
Data Mining
Strategic Information
Natural Language Processing
Systems Analysis
Decision Theory
Advanced WWW Programming
Data Structures
Curriculum Vitae (poetry)
Knowledge Discovery notes
Information Organization notes
The American Godwar Complex
Writing Bio
In the news...
Selected Literary Works
Sofia & Booker
Carrboro Poetry Festival
Selected Papers and Presentations

Text Mining

- Automatic Text Classification of Consumer Health Web Sites Using WordNet
- Using WordNet in Document Clustering of a Consumer Health Web Collection
- Beyond Relevance: Text Mining in 2010

Text & Data Mining-related

- Machine Learning for Medical Decision Support: Evaluating Diagnostic Performance of Machine Learning Classification Algorithms
- Scanning The Literature for Pharmacogenomics Knowledge and Information Overload in Drug Discovery: A Review
- An Evaluation of Three Language Parsers
- Designing metaschemas for the UMLS enriched semantic network

Other Topics in Information Science

- Traffic Cameras Fail To Prevent Moral Hazard
- Incompleteness, indeterminacy, and holism in the construction of crosswalks, or why there's no substitute for knowing your data
- We Can Remember It For You Wholesale: An evaluation of Google Desktop as a Personal Information Management tool
- SONIX: UNC School of Nursing Information Exchange Specification
