Golden Nuggets at Berlin Buzzwords 2014
If you haven’t heard of Berlin Buzzwords, it’s the conference covering all the latest buzzwords surrounding the biggest buzzword of them all: Big Data. And boy is it buzzwordy! I’m …
Artificial Intelligence Researcher, Software Engineer, Entrepreneur
I work as a research scientist at the Goldman Sachs AI research group, on Natural Language Understanding and Knowledge Graphs. In the past, I co-founded two startups (Ambiverse and nudge:nudge) with the goal of creating machines that organize knowledge with AI methods. I received my PhD from the Max Planck Institute for Informatics, on the topic of discovering and disambiguating named entities in text.
Have a look at my publications and projects.
My research interests are natural language understanding and knowledge base construction, on the Web and in finance.
@inproceedings{Mulang:2020jb,
author = {Mulang, Isaiah Onando and Singh, Kuldeep and Prabhu, Chaitali and Nadgeri, Abhishek and Hoffart, Johannes and Lehmann, Jens},
title = {{Evaluating the Impact of Knowledge Graph Context on Entity Disambiguation Models}},
booktitle = {Proceedings of the 29th ACM International Conference on Information and Knowledge Management, CIKM 2020},
year = {2020},
pages = {2157--2160},
publisher = {ACM},
address = {New York, NY, USA}
}
@article{DelCorro:2020jb,
author = {Del Corro, Luciano and Hoffart, Johannes},
title = {{Unsupervised Extraction of Market Moving Events with Neural Attention}},
journal = {CoRR},
year = {2020},
volume = {cs.CL}
}
@article{Weikum:2019is,
author = {Weikum, Gerhard and Hoffart, Johannes and Suchanek, Fabian M},
title = {{Knowledge Harvesting: Achievements and Challenges}},
journal = {Computing and Software Science},
year = {2019},
volume = {10000},
number = {1},
pages = {217--235}
}
@techreport{Balada:2018vn,
author = {Balada, Christoph and Bellanova, Alexandra and Bruss, Michael and Buchberger, Stefan and Cirullies, Jan and Del Corro, Luciano and Festag, Reinhard and Fuhs, Gregor and Goetze, Stephan and Gressling, Thorsten and Havemann, Maike and Hoffart, Johannes and Holtel, Stefan and Hufenstuhl, Andreas and Kraus, Werner and Pfleger, Norbert and Pikus, Yevgen and Altran, Robin and Plumbaum, Till and Rolletschek, Gerhard and Satow, Lars and Schnakenburg, Igor and Siepmann, Ralph and Steffner, Rupert and Shozo, Moritz Takaya and Weber, Matthias and Wieczorek, Sebastian and Wittenburg, Georg},
title = {{Digitalisierung gestalten mit dem Periodensystem der K{\"u}nstlichen Intelligenz}},
year = {2018}
}
@inproceedings{Agarwal:2018tb,
author = {Agarwal, Prabal and Str{\"o}tgen, Jannik and Del Corro, Luciano and Hoffart, Johannes and Weikum, Gerhard},
title = {{diaNED: Time-Aware Named Entity Disambiguation for Diachronic Corpora}},
booktitle = {Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia},
year = {2018},
pages = {686--693}
}
@inproceedings{Seyler:2018vc,
author = {Seyler, Dominic and Dembelova, Tatiana and Del Corro, Luciano and Hoffart, Johannes and Weikum, Gerhard},
title = {{A Study of the Importance of External Knowledge in the Named Entity Recognition Task}},
booktitle = {Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia},
year = {2018},
pages = {241--246}
}
@article{Seyler:2017ww,
author = {Seyler, Dominic and Dembelova, Tatiana and Del Corro, Luciano and Hoffart, Johannes and Weikum, Gerhard},
title = {{KnowNER - Incremental Multilingual Knowledge in Named Entity Recognition}},
journal = {CoRR},
year = {2017},
volume = {cs.CL}
}
@inproceedings{Rebele:2016vx,
author = {Rebele, Thomas and Suchanek, Fabian M and Hoffart, Johannes and Biega, Joanna and Kuzey, Erdal and Weikum, Gerhard},
title = {{YAGO - A Multilingual Knowledge Base from Wikipedia, Wordnet, and Geonames}},
booktitle = {Proceedings of the 15th International Semantic Web Conference, ISWC 2016},
year = {2016},
pages = {177--185},
publisher = {Springer International Publishing},
address = {Cham}
}
@inproceedings{Hoffart:2016bp,
author = {Hoffart, Johannes and Milchevski, Dragan and Weikum, Gerhard and Anand, Avishek and Singh, Jaspreet},
title = {{The Knowledge Awakens: Keeping Knowledge Bases Fresh with Emerging Entities}},
booktitle = {Proceedings of the 25th International Conference Companion on World Wide Web, WWW 2016, Montreal, Canada},
year = {2016},
pages = {203--206},
publisher = {International World Wide Web Conferences Steering Committee},
month = apr
}
@article{Weikum:2016vn,
author = {Weikum, Gerhard and Hoffart, Johannes and Suchanek, Fabian},
title = {{Ten Years of Knowledge Harvesting: Lessons and Challenges}},
journal = {IEEE Data Eng. Bull.},
year = {2016},
volume = {39},
number = {3},
pages = {41--50}
}
@inproceedings{Schmidt:2016kr,
author = {Schmidt, Andreas and Hoffart, Johannes and Milchevski, Dragan and Weikum, Gerhard},
title = {{Context-Sensitive Auto-Completion for Searching with Entities and Categories}},
booktitle = {Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval - Systems Demonstrations},
year = {2016},
pages = {1097--1100},
publisher = {ACM Press}
}
@inproceedings{Ernst:uv,
author = {Ernst, Patrick and Siu, Amy and Milchevski, Dragan and Hoffart, Johannes and Weikum, Gerhard},
title = {{DeepLife: An Entity-aware Search, Analytics and Exploration Platform for Health and Life Sciences}},
booktitle = {Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics - System Demonstrations},
year = {2016},
pages = {19--24}
}
@inproceedings{Singh:2016du,
author = {Singh, Jaspreet and Hoffart, Johannes and Anand, Avishek},
title = {{Discovering Entities with Just a Little Help from You}},
booktitle = {Proceedings of the 25th ACM International Conference on Information and Knowledge Management, CIKM 2016, Indianapolis, USA},
year = {2016},
pages = {1331--1340},
publisher = {ACM Press}
}
@phdthesis{Hoffart:2015wk,
author = {Hoffart, Johannes},
title = {{Discovering and Disambiguating Named Entities in Text}},
year = {2015},
month = feb
}
@inproceedings{Hoffart:2015dr,
author = {Hoffart, Johannes and Preda, Nicoleta and Suchanek, Fabian M and Weikum, Gerhard},
title = {{Knowledge Bases for Web Content Analytics}},
booktitle = {Tutorial at WWW 2015, Florence, Italy},
year = {2015},
pages = {1--1}
}
@inproceedings{Hoffart:2014dt,
author = {Hoffart, Johannes and Milchevski, Dragan and Weikum, Gerhard},
title = {{STICS: Searching with Strings, Things, and Cats}},
booktitle = {The 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2014, Gold Coast, QLD, Australia},
year = {2014},
pages = {1247--1248}
}
@inproceedings{Hoffart:2014cy,
author = {Hoffart, Johannes and Milchevski, Dragan and Weikum, Gerhard},
title = {{AESTHETICS: Analytics with Strings, Things, and Cats}},
booktitle = {Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, CIKM 2014, Shanghai, China},
year = {2014}
}
@inproceedings{Nguyen:2014wl,
author = {Nguyen, Dat Ba and Hoffart, Johannes and Theobald, Martin and Weikum, Gerhard},
title = {{AIDA-light: High-Throughput Named-Entity Disambiguation}},
booktitle = {Linked Data on the Web at WWW2014},
year = {2014}
}
@inproceedings{Hoffart:2014hp,
author = {Hoffart, Johannes and Altun, Yasemin and Weikum, Gerhard},
title = {{Discovering emerging entities with ambiguous names}},
booktitle = {Proceedings of the 23rd International Conference on World Wide Web, WWW 2014, Seoul, South Korea},
year = {2014},
pages = {385--396}
}
@inproceedings{Seufert:2013tx,
author = {Seufert, Stephan and Bedathur, Srikanta J and Hoffart, Johannes and Gubichev, Andrey and Berberich, Klaus},
title = {{Efficient Computation of Relationship-Centrality in Large Entity-Relationship Graphs}},
booktitle = {Posters and Demonstrations Track of the 12th International Semantic Web Conference, ISWC 2013, Sydney, Australia},
year = {2013},
pages = {1--4}
}
@inproceedings{Wang:2013wx,
author = {Wang, Yafang and Jian, Lili and Hoffart, Johannes and Weikum, Gerhard},
title = {{YaLi: a Crowdsourcing Plug-In for NERD}},
booktitle = {SIGIR 2013, Dublin, Ireland},
year = {2013}
}
@inproceedings{Yosef:2013vb,
author = {Yosef, Mohamed Amir and Bauer, Sandro and Hoffart, Johannes and Spaniol, Marc and Weikum, Gerhard},
title = {{HYENA-live: Fine-Grained Online Entity Type Classification from Natural-language Text}},
booktitle = {Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013, Sofia, Bulgaria},
year = {2013},
pages = {133--138}
}
@inproceedings{Jiang:2013tw,
author = {Jiang, Lili and Wang, Yafang and Hoffart, Johannes and Weikum, Gerhard},
title = {{Crowdsourced Entity Markup}},
booktitle = {CrowdSem Workshop at the 12th International Semantic Web Conference, ISWC 2013, Sydney, Australia},
year = {2013}
}
@inproceedings{Hoffart:2013wk,
author = {Hoffart, Johannes},
title = {{Discovering and Disambiguating Named Entities in Text}},
booktitle = {PhD Symposium at ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, New York City, USA},
year = {2013},
pages = {43--48}
}
@inproceedings{Suchanek:2013vd,
author = {Suchanek, Fabian M and Hoffart, Johannes and Kuzey, Erdal and Lewis-Kelham, Edwin},
title = {{YAGO2s: Modular High-Quality Information Extraction with an Application to Flight Planning}},
booktitle = {15. GI-Fachtagung Datenbanksysteme f{\"u}r Business, Technologie und Web},
year = {2013}
}
@inproceedings{Hoffart:2013ww,
author = {Hoffart, Johannes and Suchanek, Fabian M and Berberich, Klaus and Weikum, Gerhard},
title = {{YAGO2: A Spatially and Temporally Enhanced Knowledge Base from Wikipedia: Extended Abstract}},
booktitle = {23rd International Joint Conference on Artificial Intelligence, IJCAI 2013, Beijing, China},
year = {2013},
pages = {3161--3165}
}
@article{Hoffart:2013hn,
author = {Hoffart, Johannes and Suchanek, Fabian M and Berberich, Klaus and Weikum, Gerhard},
title = {{YAGO2: A spatially and temporally enhanced knowledge base from Wikipedia}},
journal = {Artificial Intelligence},
year = {2013},
volume = {194},
pages = {28--61},
month = jan
}
@inproceedings{Yosef:2012tz,
author = {Yosef, Mohamed Amir and Bauer, Sandro and Hoffart, Johannes and Spaniol, Marc and Weikum, Gerhard},
title = {{HYENA: Hierarchical Type Classification for Entity Names}},
booktitle = {Proceedings of the 24th International Conference on Computational Linguistics, Coling 2012, Mumbai, India},
year = {2012},
pages = {1361--1370}
}
@inproceedings{Hoffart:2012vx,
author = {Hoffart, Johannes and Seufert, Stephan and Nguyen, Dat Ba and Theobald, Martin and Weikum, Gerhard},
title = {{KORE: Keyphrase Overlap Relatedness for Entity Disambiguation}},
booktitle = {Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM 2012, Hawaii, USA},
year = {2012},
pages = {545--554}
}
@article{Weikum:2012wb,
author = {Weikum, Gerhard and Hoffart, Johannes and Nakashole, Ndapandula and Spaniol, Marc and Suchanek, Fabian and Yosef, Mohamed Amir},
title = {{Big Data Methods for Computational Linguistics}},
journal = {IEEE Data Eng. Bull.},
year = {2012},
volume = {35},
pages = {46--55}
}
@inproceedings{Hoffart:2011a,
author = {Hoffart, Johannes and Yosef, Mohamed Amir and Bordino, Ilaria and F{\"u}rstenau, Hagen and Pinkal, Manfred and Spaniol, Marc and Taneva, Bilyana and Thater, Stefan and Weikum, Gerhard},
title = {{Robust Disambiguation of Named Entities in Text}},
booktitle = {Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, EMNLP 2011, Edinburgh, Scotland},
year = {2011},
pages = {782--792}
}
@inproceedings{Hoffart:2011,
author = {Hoffart, Johannes and Suchanek, Fabian M and Berberich, Klaus and Lewis-Kelham, Edwin and de Melo, Gerard and Weikum, Gerhard},
title = {{YAGO2: Exploring and Querying World Knowledge in Space, Context, and Many Languages}},
booktitle = {Proceedings of the 20th International Conference Companion on World Wide Web, WWW 2011, Hyderabad, India},
year = {2011},
pages = {229--232},
publisher = {ACM}
}
@inproceedings{Yosef:2011,
author = {Yosef, Mohamed Amir and Hoffart, Johannes and Bordino, Ilaria and Spaniol, Marc and Weikum, Gerhard},
title = {{AIDA: An Online Tool for Accurate Disambiguation of Named Entities in Text and Tables}},
booktitle = {Proceedings of the 37th International Conference on Very Large Databases, VLDB 2011, Seattle, WA, USA},
year = {2011},
pages = {1450--1453}
}
@techreport{Hoffart:2010,
author = {Hoffart, Johannes and Suchanek, Fabian M and Berberich, Klaus and Weikum, Gerhard},
title = {{YAGO2: A Spatially and Temporally Enhanced Knowledge Base from Wikipedia}},
year = {2010},
address = {Saarbr{\"u}cken, Germany},
month = nov
}
@inproceedings{Hoffart:2009,
author = {Hoffart, Johannes and Zesch, Torsten and Gurevych, Iryna},
title = {{An Architecture to Support Intelligent User Interfaces for Wikis by Means of Natural Language Processing}},
booktitle = {Proceedings of the 5th International Symposium on Wikis and Open Collaboration, WikiSym 2009, Orlando, FL, USA},
year = {2009},
publisher = {ACM},
month = oct
}
@inproceedings{Hoffart:2009a,
author = {Hoffart, Johannes and B{\"a}r, Daniel and Zesch, Torsten and Gurevych, Iryna},
title = {{Discovering Links Using Semantic Relatedness}},
booktitle = {INEX 2009 Workshop Preproceedings, 2009, Brisbane, Australia},
year = {2009},
pages = {314--325}
}
AIDA is a framework and online tool for entity detection and disambiguation. Given a natural-language text, such as a news article, it maps mentions of ambiguous names onto canonical entities (e.g., individual people or places) registered in the YAGO knowledge base. A description and a demo are available on the AIDA website. AIDA is now part of the AmbiverseNLU suite.
YAGO is a huge semantic knowledge base derived from Wikipedia, WordNet, and GeoNames. YAGO currently covers more than 10 million entities (such as persons, organizations, and cities) and contains more than 120 million facts about them. All the data, as well as several interfaces for browsing and querying it, is available on the YAGO website.
STICS is an entity-centric search engine built on AIDA and YAGO. By extending the Google slogan of “things, not strings” to also support entity categories, STICS provides powerful functionality for querying and analyzing news and other text corpora in terms of entities, semantic classes, and text phrases. You can search, for example, for presidents of the United States and the JFK airport, and see how STICS distinguishes between JFK and JFK.
I just read the latest blog post on the Heidelberg Laureate Forum, and I want to say that I very much share the feelings of Adrian Dudek, who wrote it. …
When coding for a long time in a black-on-white editor, I always end up with unfocused and sleepy eyes. Going white-on-black helps keep my eyes focused, however reading the code …