- Page created by Daniel, 30 May 2008: mv from [[WikiWord]]
- Last modified by Daniel, 1 October 2008: /* Related Projects */ PowerSet http://www.powerset.com/
WikiWord is a project to build a thesaurus by extracting lexical and semantic information using text mining and data mining techniques. This page is a loose collection of material and ideas about that project.
A short exposé (in German) is at WikiWord/Expose.
About WikiWord
- Some ideas presented at Wikimania 2005: http://meta.wikimedia.org/wiki/Transwiki:Wikimania05/DK1
- Some more presentations: http://brightbyte.de/papers/2005, http://brightbyte.de/papers/2006
- Current presentations and papers: http://brightbyte.de/papers/2008
- Old prototype code (partially broken): http://tools.wikimedia.de/~daniel/rawview/rawview.php/WikiSense-trunk/wikiword
- A rewrite in Java is in progress; release planned for summer 2008.
Papers
- My papers on bibsonomy: wikipedia, wikiword
- OAIster search for wikipedia: [1]
- http://www.aifb.uni-karlsruhe.de/Forschungsgruppen/WBS/Publications/
- Collection of wikipedia studies at AI3: http://www.mkbergman.com/?p=417 (-> http://www.mkbergman.com -> Zitgist!)
- wp:Wikipedia:Wikipedia_in_academic_studies: more than 200 papers about wikipedia
- Making WikiMedia resources more useful for translators (Wikimania 2007): [2], [3], [4] -- Alain Desilets (National Research Council of Canada)
Wikimania 2008
- Mining & Semantics: Wikipedia Mining [5] (!!!)
- Translation & Mining: Analyzing Interlanguage links of Wikipedias [6] (!!!)
- Mining & Linking: Wikitag [7] (!!!)
- Mining & semantics: Integrating Wikipedia into a global network of Semantic Web for advanced information access [8] (!!!)
Related Projects
- Wortschatz WP - http://wortschatz.uni-leipzig.de/WP/ wp:de:Benutzer:Meep
- knewco - http://www.knewco.com/
- OmegaWiki - http://www.omegawiki.org/
- DBWiki - http://www.dbwiki.de/
- Semantic MediaWiki - http://meta.wikimedia.org/wiki/Semantic_MediaWiki - http://ontoworld.org
- Semantic wikipedia - http://en.wikipedia.org/wiki/Wikipedia:Semantic_Wikipedia
- FreeBase - http://www.freebase.com
- OntoWiki - http://ontowiki.net
- DBPedia http://dbpedia.org/ http://wikipedia.3ba.se/ - Sören Auer, Jens Lehmann; Uni Leipzig
- WikiData link collection: http://meta.wikimedia.org/wiki/Wikidata
- patbam doing translit / parallel texts: http://ruphus.com/svn/wikialign/ - also see blog on Wikipedia, MediaWiki, Linguistics, and i18n: http://blogamundo.net/dev/ - personal bib: http://www.citeulike.org/user/snifty
- UMBEL (AI3)
- Wortsurfer
- http://wikipedia-lab.org/en/index.php/Main_Page - Wikipedia laboratory is a special interest group on Wikipedia mining.
- http://wikixmldb.dyndns.org/ - Wikipedia via XQuery
- BabelWiki / WikiTerm -- CFP: http<nospam>://babelwiki.notlong.com
- http://www.wiki-translation.com/tiki-index.php Wiki-Translation
- http://hilt.cdlr.strath.ac.uk/hilt4/index.html - HILT - High-Level Thesaurus (Anu Joseph)
- http://www.mpi-inf.mpg.de/~suchanek/downloads/yago/ YAGO, F.M. Suchanek
- PowerSet http://www.powerset.com/
Classifications
Mining identifiers from various classification systems could be the next big step for WikiWord. Here are a few:
- Medicine, Biology, Chemistry:
  - CAS number
  - EC number
  - ICD-10
  - PubChem, DrugBank
  - ATC
  - IUPAC
  - Biopharmaceutics Classification System (BCS)
- Mathematics, Informatics:
  - Mathematics Subject Classification (MSC)
- Astronomy:
  - Bright Star Catalogue
  - Henry Draper Catalogue
  - SAO Catalog
  - Tycho Catalogue
  - Hipparcos Catalogue
- Publications:
  - ISBN, ISSN
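As a rough illustration of what mining such identifiers might involve, here is a minimal Python sketch that pulls CAS numbers and ISSNs out of free text and keeps only those whose check digit is valid. This is not WikiWord code: the function names, the regular expressions, and the sample sentence are all assumptions made for this example, and real wiki text (templates, infoboxes, unusual formatting) would likely need more careful handling.

```python
import re

# Hypothetical patterns, chosen for illustration only.
# CAS number: 2-7 digits, 2 digits, 1 check digit, separated by hyphens.
CAS_RE = re.compile(r"\b(\d{2,7})-(\d{2})-(\d)\b")
# ISSN: 4 digits, hyphen, 3 digits plus a check character (digit or X).
ISSN_RE = re.compile(r"\b(\d{4})-(\d{3})([\dXx])\b")


def cas_is_valid(cas: str) -> bool:
    """Check the CAS check digit: weighted digit sum (weights 1, 2, 3, ...
    counted from the right, excluding the check digit) modulo 10."""
    digits = cas.replace("-", "")
    body, check = digits[:-1], int(digits[-1])
    total = sum(int(d) * w for w, d in enumerate(reversed(body), start=1))
    return total % 10 == check


def issn_is_valid(issn: str) -> bool:
    """Check the ISSN check character: weights 8..2 on the first seven
    digits, modulo 11, with 'X' standing for the value 10."""
    chars = issn.replace("-", "").upper()
    total = sum(int(d) * w for d, w in zip(chars[:7], range(8, 1, -1)))
    expected = (11 - total % 11) % 11
    return chars[7] == ("X" if expected == 10 else str(expected))


def extract_identifiers(text: str) -> dict:
    """Return all CAS numbers and ISSNs in the text that pass validation."""
    found = {"CAS": [], "ISSN": []}
    for match in CAS_RE.finditer(text):
        if cas_is_valid(match.group(0)):
            found["CAS"].append(match.group(0))
    for match in ISSN_RE.finditer(text):
        if issn_is_valid(match.group(0)):
            found["ISSN"].append(match.group(0))
    return found


if __name__ == "__main__":
    sample = "Water has CAS 7732-18-5, not 7732-18-4; see ISSN 0378-5955."
    print(extract_identifiers(sample))
```

Validating the check digit is a cheap way to discard strings that only look like identifiers, for example date ranges or page ranges. Other schemes from the list (ICD-10, MSC, star catalogue numbers) would each need their own pattern and, where the scheme defines one, their own validation rule.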
Further References
- Cfp: Wikipedia Workshop @ AAAI 2008: http://lit.csci.unt.edu/~wikiai08/index.php/Main_Page
- Cfp: 2nd Workshop on Social Aspects of the Web (SAW 2008): http://bis.kie.ae.poznan.pl/11th_bis/wscfp.php?ws=saw2008 (One topic: Mining formal semantics from social sources)
- public bookmarks for WikiWord: http://del.icio.us/brightbyte/WikiWord
- http://www.dwds.de/ Wörterbuch der Deutschen Sprache
- http://www.cl.uni-heidelberg.de/~frank/kurse/ss07/CrossLingualMethods/ Cross-Lingual Methods
- http://www.citeulike.org/user/snifty/article/1336404 Poor Man’s Stemming: Unsupervised Recognition of Same-Stem Words [9]
- SABRE Conference on Social Semantic Web: CSSW07, http://sabreconference.wifa.uni-leipzig.de/frontend/index.php?folder_id=43
- http://www.openlinksw.com/ RDF/RDBMS federation (Virtuoso)
- Conceptualize: http://en.wikiversity.org/wiki/Conceptualize:_A_Wikiversity_Learning_Project
- Open Text Mining - http://opentextmining.org/wiki/Main_Page
- WikiProject Microformats - http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Microformats
- Semantic Queries on Wikipedia: http://wikipedia.3ba.se/
- Live updates from Wikipedia: http://meta.wikimedia.org/wiki/Wikimedia_update_feed_service
- TMRA: http://www.informatik.uni-leipzig.de/~tmra/2007/
- Topincs (Robert Cerny)
- Topic Maps Wiki - The Wiki Way of Knowledge Management (T. Redmann, H. Thomas, TU Ilmenau) (!)
- WiKeyPoDia http://www.keypointdialog.org
- Automatic Topic Maps generation from Free Text (Ann Houston) (!)
- Knowledge Representation of Distributed Biomedical Information (V. Stümpflen, K. Nenova, T. Barnickel) -> KnewCo!
- Hypertopic (L. Zaher, J.-P. Cahier, C. Guittard)
- fuzzzy.com (R. Lachica, D. Karabeg)
- Microformats for Wikipedia: http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Microformats
- The Open Knowledge Foundation (wiki)
- Wikipedians
- http://www.glottopedia.org/
- Daniel Bauer, Uni Osnabrück (!)
Standards/Formats
- Thesauri: ISO 2788, ISO 5964
- LOM/RDFS
- SWRC
- SKOS core guide
- TBX [10] [11]
- LMF



