Understanding and Tackling Social Problems Like COVID-19 Through Networks and AI
In this interview with Dr. Mayank Kejriwal, a professor at USC, we’re given an overview of Dr. Kejriwal’s work on AI and network science for tackling problems like COVID-19. We’ve included a short transcription of the interview, but you can listen to the full podcast here.
Jason: […] As a real thought leader in technology, give me an example of where you see some new technology being used in the right way and where you see some technology that’s giving you pause saying this could go down the wrong direction very quickly.
Dr. Kejriwal: I would say an example of technology being used in the right way is when the White House issued a directive releasing 29,000 academic papers on the coronavirus […]
Be aware that this virus is just one member in a family — the coronaviruses are a family, and this is a particular strain causing COVID-19, coronavirus disease 2019. But this family of viruses itself is not novel. There have been more than 29,000 papers to date, and there will probably be a lot more during this crisis. But even before this, there were 29,000 papers.
So what the White House did was release all these papers as a cohesive data set, and it started working with technology providers — with Microsoft and with some other big companies and even smaller companies. They actually made the 29,000 papers available and, through the directive, asked AI scientists, including groups like ours, to try and apply our technology to these 29,000 papers. Maybe to build a knowledge graph, or recommendations, or predictions — whatever AI or machine learning techniques we feel can be applied to this data set to help doctors and other people come up with a cure or vaccine faster. Also to provide supporting technology as they do the medical science, to help them better come to grips with the 29,000 papers. I mean, no one has read all those papers, right? It’s humanly impossible. So there may be things that they are looking for, but so many papers make them very difficult to find.
That’s one place where we have really been making efforts. We have been processing the 29,000 papers in our group and collecting large amounts of social media data. We have a lot of news data – tens of thousands of news articles that have come to us since the early part of this year. So there’s a lot of social science and social media and news analysis that we are doing. We are also trying to build a knowledge graph from the papers. This is work that we are doing within the institute and within the university, which I think is a great thing because it goes to show that we were not hampered by bureaucracy.
We’re now in an era where, because of supporting technologies like cloud computing and Kaggle, the dataset was released and is now even open for a competition, I believe on Kaggle. Because of all these platforms and all these websites, and the ease with which we can share data, it has become possible to start building things and applying them to these kinds of data sets at short notice.
The other thing which is interesting is that, even beyond these datasets, some of the big companies and the press, like the New York Times, and journal publishers like Springer and Sapio and so on, have made the coronavirus papers free of charge.
For me, that doesn’t make a big difference because within the university system we always have subscriptions, so we can access all of this for free. But outside of the university system, or if you don’t have a very expensive subscription, you have to purchase these articles, and usually they are quite expensive. It could be $20 or $30 per article in many cases, which means it’s really not feasible at scale. Maybe if you downloaded an article every six months it might be feasible, but if you’re trying to download and study hundreds of articles, it becomes economically infeasible. So these companies have voluntarily decided to make a lot of coronavirus press coverage and journal articles free. This also enables the broader public to participate in a way — it’s not just something for universities or people who are elitist, essentially. I sort of maybe fall in that class, but it’s something that is democratic for that reason.
The other day I was also invited — I haven’t accepted yet, but I might — to participate in a virtual hackathon being organized on a platform that makes a lot of resources and data sets available. They’re inviting developers and all kinds of other people who want to apply technology to log into the platform, sign up for the virtual hackathon, and donate some of their time and expertise to build things.
So this has been a great outcome of the technology, showing how we can really apply the things that we have been developing and working on to these datasets. Now, the effectiveness remains to be seen, so we have to wait, maybe a year from now or six months from now, to see just how effective we are. Sure, we use the data sets and we publish a lot of papers. We may get some press, but how effective will it be? Will we be able to contribute something useful through our technology? That’s what we want to know right now, but maybe six months or a year later we will see some very compelling success stories of how AI and machine learning played a big role. So I’m keeping my fingers crossed, and hopefully I will get to play some kind of role in that […] Listen to the full podcast here.