ESWC2015 was held in Portorož, Slovenia, which means that you can actually dive into the Adriatic Sea and get some sun.
Let’s begin with some statistics on ESWC2015:
- 43 papers (34 research and 9 in-use) selected out of 164 submissions
- 23% acceptance rate for research and 47% for in-use papers
- 12 PhD symposium contributions
- ~4.75 reviews per article
- 16 workshops
- 9 tutorials
- 1 hackfest
- 4 challenges
- Best research paper: Olaf Hartig
- Best in-use paper: Vania Dimitrova
I don’t have the numbers on attendance, but it seemed to be about 300 people.
So, what was I doing at the conference?
Claudia d’Amato and Philippe Cudré-Mauroux managed the PhD symposium track with a very constructive organisation that ensured real mentoring for the students who participated. I presented in this track. I should also mention Bijan Parsia, my mentor, who helped me prepare the final version of the paper and commented on my presentation. I appreciate it.
About my presentation: it was my first official appearance in the SemWeb community, presenting my Doctoral Consortium paper in the PhD symposium track. Overall, I think my presentation went well. I figured out some practicalities while presenting that I need to take into account in the future. The symposium gave me the chance to test my time-management and stress-management skills, which I think I handled quite well. Not many senior researchers participated in the PhD symposium track because of overlaps in the schedule: the workshops were scheduled on the very same day. (Maybe that was the better choice, if we want to be fair here.)
I got feedback from Marta Sabou, Enrico Motta and another PhD student, which I can summarise as follows:
– my research is useful for improving scholarly communication and the reproducibility of wet-lab research; however, it can be challenging because:
– my research is interdisciplinary, which means it depends greatly on close collaboration with food chemists. [As a joke, Philippe mentioned that to make appointments with researchers you need to act very fast, starting from yesterday.]
– the gap between my research objectives and the existing reality is big, so I need to focus and narrow it down. It might not be feasible to address them all. The focus should be more on the formalisation of the methods in the wet lab.
My research problem is indeed a socio-technical problem. I remember that, when I met Jeremy Frey in Southampton, he quoted the following line from Peter Murray-Rust:
“you need to fight for your Ontologies, developing ontologies in ‘Physical sciences’ is not easy. Chemists don’t want ontologies. They’ll sue you.” 🙂
But there is a first time for everything. If we are to have a linked information space of publications, we should look into other research domains and try to apply our solutions there. For this, we should familiarise other sciences with the notions of RDF, ontologies, linked data and so on.
It was motivating that some fellow researchers from Madrid university, in particular from Oscar Corcho’s group, had heard about my project and told me that their group is working on similar issues. I will meet him at SSSW2015 this coming July.
I also found the work by Patrick Philipp from the AIFB Institute very practical and relevant in the domain of health care. The medical domain is growing, and its growth is marked by the increased digitalisation of patient records, the use of electronic devices as supporting tools in patient care, and the use of sensors (for example, monitoring devices and surgery recording devices). The result is an abundance of medical data. There are challenges, such as data format heterogeneity, distribution of the data sets, interoperability issues and basic pre-processing. He is trying to address these problems by introducing a concrete architecture that supports data consolidation and integration based on Linked Data principles.
From the Netherlands, and in particular from the VU, two DC papers were accepted: apart from me, Anca Dumitrache also presented her work on crowdsourcing. Moreover, Hamid Bazoubandi participated in the Semantic Data Management and Big Data2 track, and Laurens Rietveld participated in Mobile Sensors, Services & Web of Things (Wouter Beek presented). I enjoyed both of the talks.
I have not summarised every session. However, here is a summary of a few bits that caught my attention at ESWC2015.
- I missed this, but I talked about it with Dan Brickley. The talks and discussions during this workshop were around pretty pragmatic issues. In particular how we can obtain real decentralisation, problems with centralised DNS and internet infrastructure, the lack of attention paid in this community to security issues, and the importance of understanding social processes and current practice for ensuring the web continues to function and that we don’t “break it by accident” (Henry Thompson).
This is nice to hear. Semantic web academics, who are pioneers of a truly linked information space, should be encouraged to think about the effects of their work on society, in particular on underprivileged groups and minorities.
I missed this. But I heard that the developers workshop was great, full of people positive about building tools and applications, and finding ways to make the power of linked data accessible to actual end users. The motivating thing is that the focus was on building for web developers rather than on end-user applications.
- Here’s the program, with links to projects and repos. Also the final session was recorded.
The main conference
I went to the following tracks:
Crowdsourcing and web science
- Revathy Krishnamurthy presented work on using general background knowledge and the contents of tweets to detect the location of Twitter users, as most Twitter users don’t have geolocation enabled when posting. They did nice experiments, such as correlating mentions of events, landmarks and slang terms with physical places, but I did not hear how they are going to address the privacy of people who deliberately do not use geolocation.
- Seyi Feyisetan from Southampton discussed different factors that affect the performance of crowd workers asked to classify entities in tweets, looking at features of the tasks themselves rather than the platform or rewards. The discussion mostly came from a social perspective: “How much do you pay the workers for each annotation task?” and “How do you evaluate your results in this way?”
Natural Language processing
- Achim Rettinger presented research on learning cross-lingual semantic representations of relations from textual data. This is mainly useful for tasks like cross-lingual information retrieval and question answering. They presented an approach that includes the semantic relations (not only the entities) expressed in the text. They actually learned a cross-lingual lexicon of relation expressions from English and Spanish Wikipedia articles.
Reasoning is not really something that I know much about, but I tried:
- Large-scale rule-based reasoning using a laptop: Martin Peters showed their approach to developing a reasoner implementation that is able to apply the RDFS rules to a dataset with more than 1 billion triples using only a single laptop. They introduced a new concept for storing the working memory on the hard disk, without the need to hold all triples in memory.
- I went to the presentation by Olaf Hartig, who won the best paper award. It was a very fundamental theoretical work on the requirements for using SPARQL over the web.
- The last paper of this session was from Raghava Mutharaju, on distributed and scalable OWL 2 EL reasoning. They investigated the problem that existing reasoners are not able to classify traffic data and other large ontologies. They introduced an open-source distributed reasoner that handles ontologies generated from streaming data.
I do not know if the above paragraphs make sense, but this is what I got.
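To make the rule-based reasoning item a bit more concrete for myself, here is a minimal sketch of what "applying the RDFS rules" means. It only covers two of the RDFS entailment rules (subclass transitivity and type inheritance), it keeps everything in memory, and the `ex:` names are made up for illustration. It is not the disk-backed approach from the talk, just a naive fixpoint loop.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Naively apply two RDFS rules until no new triple is derived.

    rdfs11: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    rdfs9:  (x type A), (A subClassOf B)       => (x type B)
    """
    inferred = set(triples)
    while True:
        new = set()
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS]
        # rdfs11: subclass transitivity
        for a, b in subclass_pairs:
            for c, d in subclass_pairs:
                if b == c:
                    new.add((a, SUBCLASS, d))
        # rdfs9: instances inherit the superclasses of their class
        for s, p, o in inferred:
            if p == RDF_TYPE:
                for a, b in subclass_pairs:
                    if o == a:
                        new.add((s, RDF_TYPE, b))
        if new <= inferred:  # fixpoint reached, nothing new was derived
            return inferred
        inferred |= new


# Hypothetical toy data
data = {
    ("ex:Espresso", SUBCLASS, "ex:Coffee"),
    ("ex:Coffee", SUBCLASS, "ex:Beverage"),
    ("ex:cup1", RDF_TYPE, "ex:Espresso"),
}
closure = rdfs_closure(data)
```

With a billion triples this loop is obviously hopeless, which is exactly why the question of where to keep the working memory matters.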
In-use and industry
I enjoyed this track. I was thinking that my project can be a good fit for the in-use track.
- Dhavalkumar Thakker and Vania Dimitrova did research on a Decision Support System (DSS) in the tunnelling domain. Their objective was to address the complex problem of identifying pathologies based on the disorders present in various tunnel portions, and of understanding the contextual factors that affect a tunnel. For this they used semantic technologies in a DSS. They developed and used ontologies to capture tacit knowledge from tunnel experts (how about that??). Tunnel inspection data are annotated with ontologies in order to use the inferencing capabilities enabled by semantic technologies. They also developed a mechanism that exploits abstraction and inference capabilities to identify pathologies. Their system, PADTUN, is now in use, and it is applied in a tunnel diagnosis use case with Société Nationale des Chemins de Fer Français (SNCF), France.
- Yolanda Gil (Felix Michel from their team presented) did research on supporting open collaboration in science through explicit and linked semantic descriptions of processes. Very good and relevant to me! Most scientific collaboration is done through emails, phones and drives, but scientists can benefit from collaborative online platforms (e.g., blogs, wikis, forums, code sharing, and so on). Their objective was to develop a collaborative infrastructure for scientists to work on complex science questions that require multi-disciplinary contributions to gather and analyse data, that cannot be answered without significant coordination to synthesise findings, and that grow organically (scientists come and go over time). They defined a task-centred framework for the collaboration that includes principles from the social sciences for successful online communities of practice (what I investigated in my Master’s thesis for e-health SME collaboration) and exposes an open science process. They also implemented a semantic wiki platform that can capture formal representations of tasks, relations between tasks and their users, and other properties of tasks, data, and other relevant research objects.
Cognition & Web Science
Jacobo Rouces presented his paper on representing n-ary relations using semantic frames. From what I understood, they developed FrameBase, a broad-coverage schema that can homogeneously integrate other KBs and has a strong connection to natural language. It can represent and query n-ary relations from other knowledge bases that have different levels of granularity.
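As a reminder to myself of why n-ary relations need special treatment in RDF: a triple only connects two things, so a relation with three or more participants is usually modelled as a node of its own with one triple per role. The sketch below shows that general pattern only. The `ex:` identifiers and the helper names `reify` and `query` are hypothetical, and this is not FrameBase's actual vocabulary or API.

```python
def reify(node_id, frame_type, roles):
    """Turn one n-ary relation into binary triples around a frame node."""
    triples = [(node_id, "rdf:type", frame_type)]
    triples += [(node_id, role, value) for role, value in roles.items()]
    return triples


def query(triples, frame_type, role):
    """Return the values filling `role` in every frame of type `frame_type`."""
    frames = {s for s, p, o in triples if p == "rdf:type" and o == frame_type}
    return [o for s, p, o in triples if s in frames and p == role]


# Hypothetical example: a purchase involves a buyer, goods and a price.
purchase = reify(
    "_:f1",
    "ex:Purchase",
    {"ex:buyer": "ex:alice", "ex:goods": "ex:book", "ex:price": "12 EUR"},
)
buyers = query(purchase, "ex:Purchase", "ex:buyer")
```

The point of a shared frame schema is that different knowledge bases can map their own ways of writing such relations onto the same frame types and roles.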
Linked data and Data management
The paper by Robert Meusel from Mannheim university was on developing heuristics for fixing common errors in deployed schema.org Microdata. They identified the most common mistakes made by providers of schema.org Microdata:
- spelling errors within types or property names (the more obvious mistakes),
- confusion of datatype properties and object properties,
- property domain and range violations.
They statistically compared these issues to a similar analysis of Linked Open Data. The result was that Microdata is cleaner than LOD. They developed some heuristics that data consumers can apply to fix a large fraction of the wrong markup in a post-processing step.
This can be very useful since more and more websites embed structured data describing for instance products, people, organizations, places, events, resumes, and cooking recipes into their HTML pages using markup formats such as RDFa, Microdata and Microformats.
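To give an idea of what such a post-processing heuristic on the consumer side could look like, here is a small sketch that repairs wrong capitalisation in type and property names by matching them case-insensitively against the vocabulary. The vocabulary below is a tiny hand-picked subset of schema.org, the item layout is my own assumption, and the heuristic is only an illustration, not necessarily one of the heuristics from the paper.

```python
# Tiny illustrative subset of schema.org terms (assumption, not the full vocabulary)
KNOWN_TYPES = {"Product", "Person", "Organization", "Place", "Event", "Recipe"}
KNOWN_PROPERTIES = {"name", "description", "addressLocality", "startDate"}

# Lower-cased lookup tables: "addresslocality" -> "addressLocality"
TYPE_INDEX = {t.lower(): t for t in KNOWN_TYPES}
PROPERTY_INDEX = {p.lower(): p for p in KNOWN_PROPERTIES}


def fix_term(term, index):
    """Return the canonical spelling if the term matches ignoring case."""
    return index.get(term.strip().lower(), term)


def fix_item(item):
    """Repair one extracted item of the form {"type": ..., "properties": {...}}.

    Terms that do not match anything in the vocabulary are left unchanged.
    """
    return {
        "type": fix_term(item["type"], TYPE_INDEX),
        "properties": {
            fix_term(name, PROPERTY_INDEX): value
            for name, value in item["properties"].items()
        },
    }


broken = {"type": "product", "properties": {"Name": "Tea", "addresslocality": "Portoroz"}}
fixed = fix_item(broken)
```

A consumer would run something like this over the extracted items before loading them, without needing the website owners to change their markup.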
I thought the demos and poster session was particularly lively.
I liked the idea behind Sarven Capadisli’s Linked Research project. It is very much in line with what I think can be a solution for laboratory notebooks: encouraging lab scientists to publish their methodology descriptions using the native web stack, and using RDF to make them queryable and discoverable at a more detailed, granular level. In other words, doing this actually means practising what the semweb community preaches.
Entity annotation isn’t my main interest; however, I liked the idea behind GERBIL, a tool for evaluating entity annotators. This easy-to-use online tool lets you compare some of the available annotators against different types of datasets, to see which would perform best for your particular use case, without you needing access to any of the test datasets yourself (because you often have to pay for licenses). Currently, their platform provides results for 9 annotators and 11 datasets, with more coming. Internally, GERBIL is based on the NLP Interchange Format (NIF) and provides Java classes for implementing APIs that connect datasets and annotators to NIF.
The Closing Ceremony
Those of you who were following ESWC on Twitter probably know by now that the general chair, Fabien Gandon, challenged himself to perform a one-minute madness covering all the themes of the conference. Yes he can! You may see the slides here.
Conferences are great places for face-to-face discussions. I had nice chats with quite a few people, in particular with Bahar Sateli on semantic publishing, the quality of scientific output, and how we should think about using semweb devs for the wider ecosystem of publications.
Finally, Slovenia is beautiful: it has warm and polite people, cool ancient caves to visit (seriously, everyone should go and visit Adelsberger Grotte or Postojna Jama), delicious seafood, the clean blue Adriatic Sea, and sun and sun and so on.
I took lots of pictures and videos, which I need to mix and master as soon as I find time.
Thanks to my supervisors Prof. Jan Top and Prof. Guus Schreiber for making it happen.