The Construction of Application-specific and Index-supported String Similarity Predicates
| AUTHOR | Reckhemke, Andre |
| PUBLISHER | VDM Verlag Dr. Mueller E.K. (02/19/2008) |
| PRODUCT TYPE | Paperback (Paperback) |
Description
In an era of globalisation, access to useful information is becoming increasingly important. Like genetic engineering, the expansion of the Internet produces very large volumes of data, frequently stored in text files. One of the most relevant points of intersection between the two is the use of approximate string matching on large text data. The Internet faces the challenge of not only concentrating on request times but also finding more context-relevant information. To that end, further work in this field has to take into account that documents can contain spelling mistakes or abbreviated words. Other pieces of information are replaced by their acronyms, or are less important and can be ignored. All of these tasks come together in the field of computational linguistics. This master's thesis shows, step by step, the tokenising of real text, the homogenisation of words, and their storage in a specific index structure for subsequent approximate string matching, taking secondary storage into account. A prototype implemented in Java completes the work.
Product Format
Product Details
ISBN-13: 9783836466387
ISBN-10: 3836466384
Binding: Paperback or Softback (Trade Paperback (US))
Content Language: English
More Product Details
Page Count: 88
Carton Quantity: 50
Product Dimensions: 6.69 x 0.18 x 9.61 inches
Weight: 0.34 pound(s)
Country of Origin: US
Subject Information
BISAC Categories
Computers | General
Your Price: $62.84
