Metadata in DC format
<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>Language detector</dc:title>
<dc:creator/>
<dc:subject>linguistics</dc:subject>
<dc:description>Finds the language of a text using the following algorithm:
* Among the 20 most common characters of the text, find the most common Unicode category. If this category is a letter category (one starting with “L”: Ll = lowercase letter, Lu = uppercase letter, Lo = other letter), continue; if not (i.e. the text consists mostly of other characters, such as punctuation), give up.
* Check whether the first word of the Unicode names of the 20 most common characters gives a unique language name.</dc:description>
<dc:type>software</dc:type>
<dc:format/>
<dc:source/>
<dc:identifier>http://elizia.net/languageDetector/languageDetector.html</dc:identifier>
<dc:language/>
<dc:rights/>
<dc:relation/>
</metadata>
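
The two steps given in the description can be sketched in Python with the standard `unicodedata` module. This is only an illustration of the described algorithm, not the code behind the identifier URL. The function name `detect_script` and the parameter `top_n` are hypothetical. Two details the description leaves open are filled in by assumption: each of the 20 characters counts once when the dominant category is chosen, and only letter characters are examined in the second step. Note that the first word of a Unicode character name (LATIN, CYRILLIC, GREEK, ...) names a script, so the result identifies a language only where one script maps to one language.

```python
import unicodedata
from collections import Counter


def detect_script(text, top_n=20):
    """Return the first word shared by the Unicode names of the most
    common letters of `text`, or None if the algorithm gives up."""
    # The top_n most frequent characters of the text.
    top_chars = [ch for ch, _ in Counter(text).most_common(top_n)]
    if not top_chars:
        return None

    # Step 1: most common Unicode general category among those characters
    # (assumption: each distinct character counts once).
    categories = Counter(unicodedata.category(ch) for ch in top_chars)
    dominant = categories.most_common(1)[0][0]
    if not dominant.startswith("L"):
        return None  # mostly punctuation, digits, spaces, etc.: give up

    # Step 2: first word of the Unicode name of each letter
    # (assumption: non-letter characters are ignored here).
    first_words = {
        unicodedata.name(ch, "").split(" ")[0]
        for ch in top_chars
        if unicodedata.category(ch).startswith("L")
    }
    first_words.discard("")  # characters without a Unicode name
    if len(first_words) == 1:
        return first_words.pop()
    return None  # several candidate names: no unique answer
```

For example, `detect_script("hello world")` yields `"LATIN"`, while a text that mixes Latin and Cyrillic letters among its most common characters yields `None`.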