Channel: AWOL - The Ancient World Online

ToPan (multilingual topic modelling for Greek, Latin, Arabic and other languages)


(Meletē)ToPān v.0.1
The name (Meletē)ToPān v.0.1 is based on the Greek principle μελέτη τὸ πᾶν, which roughly translates to "take everything into care". I chose the name because Topic-Modelling performs well on large amounts of logically structured chunks of text, and it helps with selecting the interesting bits in a large corpus by technically having looked at everything. The butterfly in the logo belongs to the genus Melete. The original photograph is by Didier Descouens, who has licensed it under CC BY-SA 4.0. I changed the image slightly for the logo. I'd strongly suggest starting with the original if you want to use it, but you can also use this slightly modified logo under the CC BY-SA 4.0 license, as I am required to share it under the same license as the original image.
ToPān is Topic-Modelling for everyone: from people without programming knowledge to people who want to build teaching and text-reuse tools and apps based on Topic-Modelling data without having to develop their own tool or substantially restructure their textual data. ToPān is made to be shared and used. That is why I tried to modularise ToPān so that you can ingest your own data at each step. It works best, however, if you work your way from left to right: from "Data Input" to "LDA Tables" (please find more details under "Instructions"). ToPān works best with files that are structured according to the CTS/CITE architecture.
ToPān is also still under active development. This is an alpha release. More features will be added, and you are encouraged to road-test ToPān and send me feedback or report bugs.
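To make the CTS/CITE remark above a little more concrete, here is a minimal Python sketch of what "CTS-structured" data can look like: each passage carries a CTS URN of the form urn:cts:namespace:textgroup.work[.version]:passage. The helper names (parse_cts_urn, group_by_work), the two-column (identifier, text) layout, and the sample URNs are assumptions made for illustration only; they are not taken from ToPān's code and do not describe its actual input format.

```python
from collections import namedtuple

# Components of a CTS URN: urn:cts:<namespace>:<textgroup>.<work>[.<version>]:<passage>
CtsUrn = namedtuple("CtsUrn", "namespace textgroup work version passage")


def parse_cts_urn(urn):
    """Split a CTS URN string into its components (hypothetical helper)."""
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")
    namespace = parts[2]
    work_parts = parts[3].split(".")
    textgroup = work_parts[0]
    work = work_parts[1] if len(work_parts) > 1 else None
    version = work_parts[2] if len(work_parts) > 2 else None
    # The passage component is optional; a URN may point at a whole work.
    passage = parts[4] if len(parts) == 5 and parts[4] else None
    return CtsUrn(namespace, textgroup, work, version, passage)


def group_by_work(rows):
    """Group (urn, text) rows into one list of (passage, text) pairs per work."""
    grouped = {}
    for urn, text in rows:
        parsed = parse_cts_urn(urn)
        key = ".".join(p for p in (parsed.textgroup, parsed.work) if p)
        grouped.setdefault(key, []).append((parsed.passage, text))
    return grouped


# Illustrative rows only: an assumed two-column layout of identifier and text.
rows = [
    ("urn:cts:latinLit:phi0472.phi001:1.1", "Cui dono lepidum novum libellum"),
    ("urn:cts:latinLit:phi0472.phi001:1.2", "arido modo pumice expolitum"),
]

for work, passages in group_by_work(rows).items():
    print(work, len(passages), "passages")
```

Because every chunk of text keeps a stable identifier, results computed per chunk (for example, topic assignments) can be traced back to the exact passage they came from.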


File                            Last commit message                          Last updated
Catullus                        fix stop word bug, create sample datasets    11 days ago
Models                          fix stop word bug, create sample datasets    11 days ago
www                             fix stop word bug, create sample datasets    11 days ago
.gitignore                      Create .gitignore                            4 months ago
Catullus.R                      fix stop word bug, create sample datasets    11 days ago
LICENSE                         Create LICENSE                               3 months ago
Petronius.csv                   recovery                                     4 months ago
README.md                       Update README.md                             3 months ago
Sandbox2.RData                  experimenting for switch from RCurl to httr  3 months ago
Sandbox2.Rhistory               experimenting for switch from RCurl to httr  3 months ago
StemDic.rds                     major updates and changes                    3 months ago
WordEmbedVec.R                  fix stop word bug, create sample datasets    11 days ago
app.R                           fix stop word bug, create sample datasets    11 days ago
caesar.csv                      fix stop word bug, create sample datasets    11 days ago
catullus.csv                    fix stop word bug, create sample datasets    11 days ago
copyright.md                    update description                           3 months ago
corpus.rds                      update                                       4 months ago
dataentry.md                    update description                           3 months ago
home.md                         Update home.md                               3 months ago
message-handler.js              major updates and changes                    3 months ago
morphologicalnormalisation.md   update description                           3 months ago
phi0972.phi001Parsed.82xf       implement 82XF                               3 months ago
preliminary.md                  update description                           3 months ago
sandbox.R                       update                                       4 months ago
sandbox2.R                      implement 82XF                               3 months ago
settingtmvalues.md              update description                           3 months ago
temp_vectors.bin                fix stop word bug, create sample datasets    11 days ago
treebank.xml                    update                                       4 months ago

 

