DIVA software

 

TermDefine

Last edited by Steven Morris 1 yr ago

Term definitions

 

There are two types of terms that can be analyzed using DIVA:

    • Index terms: These are terms associated with papers for purposes of indexing and searching. An index term has a binary link to a paper: it is either associated with the paper or not.
    • Vocabulary terms: These are terms extracted from parts of the paper's text. The number of occurrences of a vocabulary term in a paper can be counted, so links between vocabulary terms and papers can be weighted by their occurrence counts rather than being binary links.
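
The difference between the two link types can be sketched in Python as follows. This is only an illustration and not DIVA's implementation; the paper identifiers, the sample text, and the names `index_links` and `vocabulary_links` are hypothetical.

```python
from collections import Counter

# Hypothetical sample data: index terms and title text for two papers.
papers = {
    "P1": {"index_terms": {"carbon nanotubes", "sensors"},
           "text": "carbon nanotubes for gas sensors and carbon electrodes"},
    "P2": {"index_terms": {"sensors"},
           "text": "optical sensors"},
}

# Index terms: a binary link, either present (1) or absent (no entry).
index_links = {
    (pid, term): 1
    for pid, rec in papers.items()
    for term in rec["index_terms"]
}

# Vocabulary terms: links weighted by occurrence count in the text.
vocabulary_links = {
    (pid, word): count
    for pid, rec in papers.items()
    for word, count in Counter(rec["text"].lower().split()).items()
}
```

Here the word "carbon" occurs twice in the text of P1, so its vocabulary link carries a weight of 2, while every index link carries the same weight.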

 

DIVA can cluster and map ID index terms and DE index terms, which are supplied in WOS records.
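
As a rough sketch of how DE and ID terms might be read from a record, the Python below assumes a tagged plain-text layout: a two-letter field tag at the start of a line, indented continuation lines, and terms separated by semicolons. That layout, the function name `parse_index_terms`, the sample record, and the lowercasing step are all assumptions made for illustration; this is not DIVA's input routine.

```python
def parse_index_terms(record: str) -> dict[str, list[str]]:
    """Collect DE and ID terms from one tagged record (assumed layout)."""
    fields: dict[str, str] = {}
    tag = None
    for line in record.splitlines():
        if not line.strip():
            continue
        if line[:2].strip():          # a new two-letter field tag
            tag = line[:2]
            fields[tag] = line[3:].strip()
        elif tag is not None:         # indented continuation of the last field
            fields[tag] += " " + line.strip()
    # Split on semicolons, drop empty pieces, and lowercase (an assumption).
    return {
        t: [term.strip().lower()
            for term in fields.get(t, "").split(";") if term.strip()]
        for t in ("DE", "ID")
    }

# Hypothetical record fragment.
sample = (
    "TI An example title\n"
    "DE citation analysis; Research Fronts;\n"
    "   mapping\n"
    "ID SCIENCE; NETWORKS\n"
)
terms = parse_index_terms(sample)
```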

 

Vocabulary terms, consisting of 1-word, 2-word, or 3-word terms extracted from titles or abstracts, can be generated in DIVA's template database. However, DIVA's routines for input, clustering, and mapping of these terms are out of date and need to be rewritten and tested.

 

The specific types of terms that can be analyzed using DIVA are given below.

 

DE terms

 

DE terms are author-supplied index terms. These terms tend to suffer from non-standardized term selection: authors do not choose from a controlled vocabulary, so several variant terms may be used for the same concept. Another problem is that some journals do not supply index terms, leaving gaps in the coverage of a collection of papers. DE terms are provided in WOS records.

 

ID terms

 

ID terms are generated by ISI. They are probably generated by extracting frequent terms from titles, and possibly abstracts, using a proprietary algorithm. ID terms tend to have poor information content; that is, the terms are often too general in meaning to be of use. ID terms are provided in WOS records.

 

1-word terms

 

1-word terms are vocabulary terms consisting of single words extracted from text sections of the paper. In DIVA, 1-word terms can be extracted from titles and abstracts, using a Visual Basic program in the template database. 1-word terms tend to have little information content.

 

 

2-word terms

 

2-word terms are vocabulary terms consisting of pairs of adjacent words extracted in order from text sections of the paper. In DIVA, 2-word terms can be extracted from titles and abstracts, using a Visual Basic program in the template database. 2-word terms typically have enough information content to be useful, especially in collections of papers from biomedicine, engineering, and the hard sciences.

 

3-word terms

 

3-word terms are vocabulary terms consisting of triples of adjacent words extracted in order from text sections of the paper. In DIVA, 3-word terms can be extracted from titles and abstracts, using a Visual Basic program in the template database. 3-word terms typically have less information content than 2-word terms, but are sometimes useful for mapping biomedical specialties.
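
The extraction of 1-word, 2-word, and 3-word terms described above can be sketched in Python. This is a minimal illustration under stated assumptions, not a port of DIVA's Visual Basic program: the function name `extract_terms`, the tokenization rule (lowercase runs of letters and digits), and the absence of stop-word filtering are all choices made here for the example.

```python
import re
from collections import Counter

def extract_terms(text: str, n: int) -> Counter:
    """Count n-word terms formed from adjacent words, kept in order."""
    # Assumed tokenization: lowercase runs of letters and digits.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(
        " ".join(words[i:i + n]) for i in range(len(words) - n + 1)
    )

# Hypothetical title used only to exercise the function.
title = "Gene expression profiling of gene expression networks"
one_word = extract_terms(title, 1)
two_word = extract_terms(title, 2)
three_word = extract_terms(title, 3)
```

The resulting counts can serve as the occurrence weights for links between vocabulary terms and papers. In this title, the 2-word term "gene expression" is counted twice, while each 3-word term occurs only once.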
