By David Weinberger
April 28, 2009 09:30 PM EDT
Stephen Wolfram is giving a talk at Harvard about his WolframAlpha site, which will launch in May. Aim: “Find a way to make computable the systematic knowledge we’ve accumulated.”
The two big projects he’s worked on have made this possible. Mathematica (he’s worked on it for 23 yrs) makes it possible to do complex math and symbolic language manipulation. A New Kind of Science (NKS) has shown that it’s possible to understand much about the world computationally, often with very simple rules. So, WA uses NKS principles and the Mathematica engine. He says he’s in this project for the long term.
NOTE: Live-blogging. Posted without re-reading.
Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.
You type in a question and you get back answers. You can type in math and get back plots, etc. Type in “gdp france” and get back the answer and a graph of the history of GDP.
“GDP of france / italy.”
“internet users in europe” shows histogram, list of highest and lowest, etc.
“Weather in Lexington, MA” “Weather lexington,ma 11/17/92” “Weather lexington, MA moscow” shows comparison of weather and location.
“5 miles/sec” returns useful conversions and comparisons.
“$17/hr” converts to per week, per month, etc., plus conversion to other currencies.
“4000 words” gives a list of typical typing speeds, the length in characters, etc.
“333 gm gold” gives the mass, the commodity price, the heat capacity, etc.
“H2SO4” gives an illustration of the molecule, as well as the expected info about mass, etc.
“Caffeine mol wt/ water” gives the ratio of the two molecular weights.
“decane 2 atm 50 C” shows what decane is like at two atmospheres and at 50 C, e.g., phase, density, boiling point, etc.
“LDL 180”: Where your cholesterol level is against the rest of the population.
“life expectancy male age 40 italy”: distribution of survival curve, history of that life expectancy over time. Add “1933” and it gets more specific.
“5′8″ 160 lbs”: Where that falls in the distribution of body mass index
“ATTGTATACTAA”: Where that sequence matches the human genome
“MSFT”: Real time Microsoft quote and other financial performance info. “MSFT sun” assumes that “sun” refers to stock info about Sun Microsystems. [how?]
“ARM 20 yr mortgage”: tables of monthly payments, etc. Lets you input the loan amount.
“D# minor”: Musical notation, plays the D# minor scale
“red + yellow”: Color swatch, html notation
“www.apple.com”: Info about Apple, history of page views
“lawyers”: Number employed, average wage
“France fish production”: How many metric tons produced, pounds per second which is 1/5 the rate trash is produced in NYC
“france fish production vs. poland”: charts and diagrams
“2 c orange juice”: nutritional info
“2 c orange juice + 1 slice cheddar cheese”: nutritional label
“a__a__n”: English words that match
“alan turing kurt godel”: Table of info about them
“weather princeton, nj when kurt godel died”: the answer
“uncle’s uncle’s grandson’s grandson”: family tree, probability of those two sharing genetic material
“5th largest country in europe”
“gdp vs. railway length in europe”:
“hurricane andrew”: Data, map
“andrew”: Popularity of the name, diagrammed.
“president of brazil in 1922”
“tide NYC 11/5/2015”
“ten flips 4 heads”: probability
“3,7,15,31,63…”: Figures out and plots next in the sequence and possible generating function
“4,1 knot”: diagram of knot
“next total solar eclipse chicago”: Next one visible in Chicago
“ISS”: International Space Station info and map
It lets you select alternatives in case of ambiguities.
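Two of the queries above are easy to check by hand. Here is a minimal Python sketch of the arithmetic behind “ten flips 4 heads” and “3,7,15,31,63…”. It is not how WolframAlpha does it; the function names are made up for illustration, the coin is assumed to be fair, and the sequence guesser only tries one simple rule (each term is twice the previous one plus a constant), which for this sequence matches the closed form 2^(n+1) − 1.

```python
from fractions import Fraction
from math import comb
from typing import List, Optional


def prob_exact_heads(flips: int, heads: int) -> Fraction:
    """Probability of exactly `heads` heads in `flips` tosses of a fair coin."""
    # Count the arrangements with that many heads, divide by all outcomes.
    return Fraction(comb(flips, heads), 2 ** flips)


def next_term_doubling(seq: List[int]) -> Optional[int]:
    """Guess the next term if the sequence fits a(n+1) = 2*a(n) + c.

    Returns None when the sequence is too short or does not fit that rule.
    """
    if len(seq) < 3:
        return None
    c = seq[1] - 2 * seq[0]  # constant implied by the first two terms
    if all(b == 2 * a + c for a, b in zip(seq, seq[1:])):
        return 2 * seq[-1] + c
    return None


if __name__ == "__main__":
    p = prob_exact_heads(10, 4)
    print(p, float(p))                              # 105/512 0.205078125
    print(next_term_doubling([3, 7, 15, 31, 63]))   # 127
```

So four heads in ten flips comes up about 20.5% of the time, and the next number in the sequence is 127. A real system would have to try many candidate rules rather than the single one hard-coded here.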
“We’re trying to compute things.” We have tools that let us find things. But when you have a particular question, it’s unlikely that you’ll find that specific answer written down. WA therefore tries to compute answers. “The objective is to reach expert level knowledge across a very wide range of domains.”
Four big pieces to WA:
1. Data curation. WA has trillions of pieces of curated data. It gets it from free data or licensed data. A partially human, partially automated system cleans it up and tries to correlate it. “A lot can be done automatically…At some point, you need a human domain expert in the middle of it.” There are people inside the company and a network of others who do the curation.
2. The algorithms. Take equations, etc., from all over. “There are finite numbers of methods that have been discovered in the history of science.” There are 5-6 million lines of Mathematica code at work.
3. Linguistic analysis to understand the inputs. “There’s no manual, no documentation. You get to interact with it just how you think about things.” They’re doing the opposite of natural language processing, which usually tries to understand millions of pages. WA’s problem is mapping a relatively small set of short human inputs to what the system knows about. NKS helps with this. It turns out that ambiguity is not nearly as big a problem as we thought.
4. Automated presentation. What do you show people so they can cognitively grasp it? “Algorithmic presentation technology … tries to pick out what is important.” Mathematica has worked on “computational aesthetics” for years.
He says they have at least a reasonable start on about 90% of the shelves in a typical reference library.
Q: (andy orem) What do you do about the inconsistencies of data? We don’t know how inconsistent it was and what algorithms you used.
A: We give source info. “We’re trying to create an authoritative source for data.” We know about ranges of values; we’ll make that information available. “But by the time you have a lot of footnotes on a number, there’s not a lot you can do with that number.” “We do try to give footnotes.”
Q: How do you keep current?
A: Lots of people want to make their data available. We hope to make a streamlined, formalized way for people to contribute the data. We want to curate it so we can stand by it.
Q: [me] Openness? Of API, of metadata, of contributions of interesting comparisons, etc.
A: We’ll do a variety of levels of API. First: presentation level: put output on their pages. Second, XML-level so people can mash it up. Third level: individual results from the databases and from the computations. [He shows a first draft of the API] You can get at the symbolic expressions that Mathematica is based on. We hope to have a personalizable version. Metadata: When we open up our data repository mechanisms so people can contribute, some of our ontology will be exposed.
Q: How about in areas where people disagree? If a new universe model comes out from Stanford, does someone at WolframAlpha have to say yes and put it in?
Q: How many people?
A: It’s been 150 for a long time. Now it’s 250. It’s probably going to be a thousand people.
Q: Who is this for?
A: It’s for expert knowledge for anyone who needs it.
Q: Business model?
A: The site will be free. Corporate sponsors will put ads on the side. We’re trying to figure out how to ingest vendor info when it’s relevant, and how to present it on the site. There will also be a professional version for people who are doing a lot of computation, want to put in their own data…
Q: Can you combine the medical and population databases to get the total mass of people in England?
A: We could integrate those databases, but we don’t have that now. We’re working on “splat pages” you get when it doesn’t work. It should tell you what it does know.
Q: What happens when there is no answer, e.g., 55th largest state in the US?
A: It says it doesn’t know.
Q: [eszter] For some data, there are agreed-upon sources. For some there aren’t. How do you choose sources?
A: That’s a key problem in doing data curation. “How do we do it? We try to do the best job we can.” Use experts. Assess. Compare. [This is a bigger issue than Wolfram apparently thinks where data models are political. E.g., Eszter Hargittai, who is sitting next to me, points out "How many Internet users are there?" is a highly controversial question.] We give info about what our sources are.
Q: Technologically, where do you want to focus in the future?
A: All 4 areas need to be pushed forward.
Q: How does this compare to the Semantic Web?
A: Had the Web already been semantically tagged, this product would have been far, far easier, although keep in mind that much of the data in WA comes from private databases. We have a sophisticated ontology. We didn’t create the ontology top-down. It’s mostly bottom-up. We have domains. We have ontologies for them. We merge them together. “I hope as we expose some of our data repository methods, it will make it easier to do some Semantic Web kind of things. People will be able to line data up.”
Q: When can we look at the formal specifications of these ontologies? When can we inject our own?
A: It’s all represented in clean Mathematica code. Knitting new knowledge into the system is tricky because our UI is natural language, which is messy. E.g., “There’s a chap who goes by the name Fifty Cent.” You have to be careful.
Q: What reference source tells you if Palestine exists…?
A: In cases like this, we say “Assuming Case A or B.” There are holes in the data. I’m hoping people will be motivated to fill them in. Then there’s the question of the extent to which we can build expert communities. We don’t know the best way to do this. Lots of interesting ideas.
Q: How about pop culture?
A: Pop culture info is much shallower computationally. (“Britney Spears” just gets her name, birthdate, and birthplace. No music, no photos, nothing about her genre, etc.) (“Meaning of life” does answer “42”)
Q: Compare with CYC? (A common sense reasoning system)
A: CYC deals with human reasoning. That’s not the best method for figuring out physics, etc. “We can do the non-human parts of reasoning really well.”
Q: [couldn't hear the question]
A: The best way to debug it is not necessarily to inspect the code but to inspect the results. People reading code is less efficient than automated systems.
Q: Will it be integrated into Mathematica?
A: A future version will let you type WA data into Mathematica.
Q: How much work do you have to do on the NLP side? Your searches used a special lexicon…
A: We don’t know. We have a daily splat call to see what types of queries have failed. We’re pretty good at removing linguistic fluff. People drop the fluff pretty quickly after they’ve been using WA for a while.
Q: (free software foundation) How does this change the landscape for open access? There’s info in commercial journals…
A: When there’s a proprietary database, the challenge is making the right deals. People will not be able to take out of our system all the data that we put into it. We have yet to learn all of the issues that will come up.
A: We’re dealing with public data. We could do people search, but, personally, I don’t want to.
Q: What would you think of a more Wikipedia-like model? Do you worry about a competitor making a wiki data that is completely open and grows faster?
A: That’d be great. Making WA is hard. It’s not just a matter of shoveling data in. Wikipedia is fantastic and I use it all the time, but it’s gone in particular directions. When you’re looking for systematic data there, even if people put in systematic data — e.g., 300 pages about chemicals — over the course of time, the data gets dirty. You can’t compute from it.
Q: How about if Google starts presenting your results in response to queries?
A: We’re looking for synergies. But we’re generating these on the fly; it won’t get indexed.
Q: I wonder how universities will find a place for this.
A: Very interesting question. Generating hard data is hard and useful, although universities often prefer higher levels of synthesis and opinion. [Loose paraphrase!] Leibniz had this nailed: Take any human argument and find a way to mechanically compute it.