By David Weinberger
April 28, 2009 09:30 PM EDT
Stephen Wolfram is giving a talk at Harvard about his WolframAlpha site, which will launch in May. Aim: “Find a way to make computable the systematic knowledge we’ve accumulated.”
The two big projects he’s worked on have made this possible. Mathematica (he’s worked on it for 23 yrs) makes it possible to do complex math and symbolic language manipulation. A New Kind of Science (NKS) has shown that it’s possible to understand much about the world computationally, often with very simple rules. So, WA uses NKS principles and the Mathematica engine. He says he’s in this project for the long term.
NOTE: Live-blogging. Posted without re-reading.
Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.
You type in a question and you get back answers. You can type in math and get back plots, etc. Type in “gdp france” and get back the answer, plus a graph of the history of GDP.
“GDP of france / italy.”
“internet users in europe” shows histogram, list of highest and lowest, etc.
“Weather in Lexington, MA” “Weather lexington,ma 11/17/92” “Weather lexington, MA moscow” shows comparison of weather and location.
“5 miles/sec” returns useful conversions and comparisons.
“$17/hr” converts to per week, per month, etc., plus conversion to other currencies.
“4000 words” gives a list of typical typing speeds, the length in characters, etc.
“333 gm gold” gives the mass, the commodity price, the heat capacity, etc.
“H2SO4” gives an illustration of the molecule, as well as the expected info about mass, etc.
“Caffeine mol wt/ water” gives the ratio of the two molecular weights.
“decane 2 atm 50 C” shows what decane is like at two atmospheres and at 50 C, e.g., phase, density, boiling point, etc.
“LDL 180”: Where your cholesterol level is against the rest of the population.
“life expectancy male age 40 italy”: distribution of survival curve, history of that life expectancy over time. Add “1933” and it adds specificity.
“5′8″ 160 lbs”: Where you fall in the distribution of body mass index
“ATTGTATACTAA”: Where that sequence matches the human genome
“MSFT”: Real-time Microsoft quote and other financial performance info. “MSFT sun” assumes that “sun” refers to stock info about Sun Microsystems. [how?]
“ARM 20 yr mortgage”: tables of monthly payments, etc. Lets you input the loan amount.
“D# minor”: Musical notation, plays the D# minor scale
“red + yellow”: Color swatch, html notation
“www.apple.com”: Info about Apple, history of page views
“lawyers”: Number employed, average wage
“France fish production”: How many metric tons produced, and pounds per second, which is 1/5 the rate at which trash is produced in NYC
“france fish production vs. poland”: charts and diagrams
“2 c orange juice”: nutritional info
“2 c orange juice + 1 slice cheddar cheese”: nutritional label
“a__a__n”: English words that match
“alan turing kurt godel”: Table of info about them
“weather princeton, nj when kurt godel died”: the answer
“uncle’s uncle’s grandson’s grandson”: family tree, probability of those two sharing genetic material
“5th largest country in europe”
“gdp vs. railway length in europe”
“hurricane andrew”: Data, map
“andrew”: Popularity of the name, diagrammed.
“president of brazil in 1922”
“tide NYC 11/5/2015”
“ten flips 4 heads”: probability
“3,7,15,31,63…”: Figures out and plots next in the sequence and possible generating function
“4,1 knot”: diagram of knot
“next total solar eclipse chicago”: Next one visible in Chicago
“ISS”: International Space Station info and map
It lets you select alternatives in case of ambiguities.
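To make concrete what it means to compute an answer rather than look one up, here is a minimal Python sketch of three of the queries above (“ten flips 4 heads”, “3,7,15,31,63…”, and “5′8″ 160 lbs”). It is only an illustration, not how WolframAlpha works: the function names are made up for this example, the coin is assumed to be fair, and the sequence check tests just one hypothesized rule (each term is twice the previous one plus one).

```python
from math import comb


def prob_exact_heads(flips: int, heads: int) -> float:
    """Probability of exactly `heads` heads in `flips` flips of a fair coin."""
    return comb(flips, heads) / 2 ** flips


def next_if_double_plus_one(seq):
    """Return the next term if every term equals 2 * previous + 1, else None."""
    if all(b == 2 * a + 1 for a, b in zip(seq, seq[1:])):
        return 2 * seq[-1] + 1
    return None


def body_mass_index(feet: int, inches: int, pounds: float) -> float:
    """BMI in kg/m^2 from a height in feet and inches and a weight in pounds."""
    meters = (feet * 12 + inches) * 0.0254
    kilograms = pounds * 0.45359237
    return kilograms / meters ** 2


print(prob_exact_heads(10, 4))                       # 0.205078125
print(next_if_double_plus_one([3, 7, 15, 31, 63]))   # 127
print(round(body_mass_index(5, 8, 160), 1))          # 24.3
```

None of these three answers is likely to be written down anywhere in exactly this form, which is the point Wolfram makes next.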
“We’re trying to compute things.” We have tools that let us find things. But when you have a particular question, it’s unlikely that you’ll find that specific answer written down. WA therefore tries to compute answers. “The objective is to reach expert level knowledge across a very wide range of domains.”
Four big pieces to WA:
1. Data curation. WA has trillions of pieces of curated data. It gets it from free data or licensed data. A partially human, partially automated system cleans it up and tries to correlate it. “A lot can be done automatically…At some point, you need a human domain expert in the middle of it.” There are people inside the company and a network of others who do the curation.
2. The algorithms. Take equations, etc., from all over. “There are finite numbers of methods that have been discovered in the history of science.” There are 5-6 million lines of Mathematica code at work.
3. Linguistic analysis to understand the inputs. “There’s no manual, no documentation. You get to interact with it just how you think about things.” They’re doing the opposite of natural language processing, which usually tries to understand millions of pages. WA’s problem is mapping a relatively small set of short human inputs to what the system knows about. NKS helps with this. It turns out that ambiguity is not nearly as big a problem as we thought.
4. Automated presentation. What do you show people so they can cognitively grasp it? “Algorithmic presentation technology … tries to pick out what is important.” Mathematica has worked on “computational aesthetics” for years.
He says they have at least a reasonable start on about 90% of the shelves in a typical reference library.
Q: (andy orem) What do you do about the inconsistencies of data? We don’t know how inconsistent it was and what algorithms you used.
A: We give source info. “We’re trying to create an authoritative source for data.” We know about ranges of values; we’ll make that information available. “But by the time you have a lot of footnotes on a number, there’s not a lot you can do with that number.” “We do try to give footnotes.”
Q: How do you keep current?
A: Lots of people want to make their data available. We hope to make a streamlined, formalized way for people to contribute the data. We want to curate it so we can stand by it.
Q: [me] Openness? Of API, of metadata, of contributions of interesting comparisons, etc.
A: We’ll do a variety of levels of API. First: presentation level: put output on their pages. Second, XML-level so people can mash it up. Third level: individual results from the databases and from the computations. [He shows a first draft of the API] You can get at the symbolic expressions that Mathematica is based on. We hope to have a personalizable version. Metadata: When we open up our data repository mechanisms so people can contribute, some of our ontology will be exposed.
Q: How about in areas where people disagree? If a new universe model comes out from Stanford, does someone at WolframAlpha have to say yes and put it in?
Q: How many people?
A: It’s been 150 for a long time. Now it’s 250. It’s probably going to be a thousand people.
Q: Who is this for?
A: It’s for expert knowledge for anyone who needs it.
Q: Business model?
A: The site will be free. Corporate sponsors will put ads on the side. We’re trying to figure out how to ingest vendor info when it’s relevant, and how to present it on the site. There will also be a professional version for people who are doing a lot of computation, want to put in their own data…
Q: Can you combine the medical and population databases to get the total mass of people in England?
A: We could integrate those databases, but we don’t have that now. We’re working on “splat pages” you get when it doesn’t work. It should tell you what it does know.
Q: What happens when there is no answer, e.g., 55th largest state in the US?
A: It says it doesn’t know.
Q: [eszter] For some data, there are agreed-upon sources. For some there aren’t. How do you choose sources?
A: That’s a key problem in doing data curation. “How do we do it? We try to do the best job we can.” Use experts. Assess. Compare. [This is a bigger issue than Wolfram apparently thinks where data models are political. E.g., Eszter Hargittai, who is sitting next to me, points out "How many Internet users are there?" is a highly controversial question.] We give info about what our sources are.
Q: Technologically, where do you want to focus in the future?
A: All 4 areas need to be pushed forward.
Q: How does this compare to the Semantic Web?
A: Had the Web already been semantically tagged, this product would have been far, far easier, although keep in mind that much of the data in WA comes from private databases. We have a sophisticated ontology. We didn’t create the ontology top-down. It’s mostly bottom-up. We have domains. We have ontologies for them. We merge them together. “I hope as we expose some of our data repository methods, it will make it easier to do some Semantic Web kind of things. People will be able to line data up.”
Q: When can we look at the formal specifications of these ontologies? When can we inject our own?
A: It’s all represented in clean Mathematica code. Knitting new knowledge into the system is tricky because our UI is natural language, which is messy. E.g., “There’s a chap who goes by the name Fifty Cent.” You have to be careful.
Q: What reference source tells you if Palestine exists…?
A: In cases like this, we say “Assuming Case A or B.” There are holes in the data. I’m hoping people will be motivated to fill them in. Then there’s the question of the extent to which we can build expert communities. We don’t know the best way to do this. Lots of interesting ideas.
Q: How about pop culture?
A: Pop culture info is much shallower computationally. (“Britney Spears” just gets her name, birthdate, and birthplace. No music, no photos, nothing about her genre, etc.) (“Meaning of life” does answer “42”)
Q: Compare with CYC? (A common sense reasoning system)
A: CYC deals with human reasoning. That’s not the best method for figuring out physics, etc. “We can do the non-human parts of reasoning really well.”
Q: [couldn't hear the question]
A: The best way to debug it is not necessarily to inspect the code but to inspect the results. People reading code is less efficient than automated systems.
Q: Will it be integrated into Mathematica?
A: A future version will let you type WA data into Mathematica.
Q: How much work do you have to do on the NLP side? Your searches used a special lexicon…
A: We don’t know. We have a daily splat call to see what types of queries have failed. We’re pretty good at removing linguistic fluff. People drop the fluff pretty quickly after they’ve been using WA for a while.
Q: (free software foundation) How does this change the landscape for open access? There’s info in commercial journals…
A: When there’s a proprietary database, the challenge is making the right deals. People will not be able to take out of our system all the data that we put into it. We have yet to learn all of the issues that will come up.
A: We’re dealing with public data. We could do people search, but, personally, I don’t want to.
Q: What would you think of a more Wikipedia-like model? Do you worry about a competitor making a wiki data that is completely open and grows faster?
A: That’d be great. Making WA is hard. It’s not just a matter of shoveling data in. Wikipedia is fantastic and I use it all the time, but it’s gone in particular directions. When you’re looking for systematic data there, even if people put in systematic data — e.g., 300 pages about chemicals — over the course of time, the data gets dirty. You can’t compute from it.
Q: How about if Google starts presenting your results in response to queries?
A: We’re looking for synergies. But we’re generating these on the fly; it won’t get indexed.
Q: I wonder how universities will find a place for this.
A: Very interesting question. Generating hard data is hard and useful, although universities often prefer higher levels of synthesis and opinion. [Loose paraphrase!] Leibniz had this nailed: Take any human argument and find a way to mechanically compute it.