http://www.infoq.com/presentations/Value-Values
This talk was presented at the GOTO Copenhagen conference, May 2012. This is a fantastic presentation, showing once again that Hickey is amongst the top thinkers in software today. And despite tearing into traditional databases, he doesn't mention his own solution, Datomic, a persistent storage system implementing many of the ideas here. My notes follow, with time-stamps should you want semi-random-access. There's also a good HN discussion available.
- I.T. stands for information technology
- Information: the facts
- Inform: to convey knowledge via facts; to give shape to the mind; "facts" is the key word.
- What is a fact?
- A place where information is stored; operations on facts like get and set; operations control how facts can change; to convey a fact convey a location.
- NO! This is all wrong!
- ~4:00 Place: "a particular portion of space", "an area used for a particular purpose" - as coders, we know these as memory and disk/storage.
- ~4:45 'Information' Systems
- In memory: mutable objects are abstractions of places; objects have methods; but facts do not!
- In durable storage: tables/documents/records are places; DBs have update
- 6:30 PLOP: PLace-Oriented Programming
- New information replaces old - that's what most of us are coding.
- Born of limitations of early computers: small RAM and disks.
- Those limitations are long gone: we've got millions of times more space.
- 9:45 The Efficiencies of Place
- It's OK to use PLOP when "birthing" new values; birthing == prior to perceptibility, i.e. prior to becoming a fact. Once it becomes a fact, stop doing PLOP.
- But: this is an implementation detail - that's what representing things in places is.
- Remember, we want "information technology", not "technology technology."
- 11:55 Memory and Records
- We've co-opted these words, and believe our own mythos.
- Mental memory is associative and open.
- "Your friend's phone number is not a place (in your brain/mind)."
- Real records are enduring, and accreting; not erase and overwrite. No one rubbed out paper records to update information, they got another piece of paper.
- 14:10 The Point
- Values have many advantages: in process; across processes; in storage.
- We know these things already (by how uncomfortable the initial facts slide definition made us).
- Place has no role in an information model!
- 15:45 Value
- Definitions
- Relative worth
- A particular magnitude, number, or amount, e.g. "42"
- Precise meaning or significance - this is the unifying concept!
- 17:00 Is a String a Value?
- Is it immutable?
- Equality, comparability are basis for logic
- Who wants to go back to mutable Strings?
- C strings mutable; in Java, immutable.
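Python strings behave like Java's here, which makes for a quick check. A small sketch, not from the talk: `bytearray` stands in for a mutable C-style buffer, and all variable names are illustrative.

```python
s = "value"
try:
    s[0] = "V"          # str does not support item assignment
    mutated = True
except TypeError:
    mutated = False

# Equality is by content, which is what makes comparison and lookup reliable.
same = ("val" + "ue") == s

# A mutable buffer is a place: anyone holding a reference sees the change.
buf = bytearray(b"value")
alias = buf
buf[0] = ord("V")
alias_text = alias.decode()
```

The alias to the buffer changed underneath its holder; the string could not.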
- 19:00 Programming Values
- Immutable.
- Don't need methods: I can send you values without code and you are fine, i.e. you can parse and use those values.
- Are semantically transparent.
- Can be abstracted, e.g. composites and collections of values still immutable and semantically transparent.
- Benefits of Values
- 20:50 Values Can be Shared
- Share freely: aliases are free.
- No one can mess you up, nor you them.
- Big benefit of functional programming (FP), all values freely shareable.
- Incremental change is cheap.
- vs. Places: defensive copy, clone, locks - all hamper sharing.
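To make the sharing point concrete, here is a rough Python sketch (my own, not from the talk; the function and variable names are hypothetical). A list is treated as a place and a tuple as a value.

```python
def total_with_place(prices):
    # The caller may mutate the list later, so a careful callee copies first.
    snapshot = list(prices)       # defensive copy
    return sum(snapshot)

def total_with_value(prices):
    # A tuple cannot change, so the alias itself is safe to keep and share.
    return sum(prices)

shared = (3, 4, 5)
alias = shared                    # no copy is made
is_same_object = alias is shared
bigger = shared + (6,)            # "changing" yields a new value; shared is untouched
```

One caveat: Python tuples copy their elements on concatenation, so this sketch shows the safety of aliasing but not the "incremental change is cheap" claim, which as far as I know relies on persistent collections that share structure.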
- 22:40 Reproducible Results
- Operations on values are stable.
- Testing much simpler.
- Debugging: you can reproduce failures w/o replicating (global) state.
- Hickey gives an example of debugging via email - just send the value, then reproduce the error; no need to tediously duplicate state.
- Places: must establish matching "state" first - not fun.
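The "debugging via email" idea can be sketched in a few lines of Python. This is an illustration under my own assumptions, not Hickey's example: `average_latency`, `reproduce`, and the `latencies_ms` field are hypothetical names.

```python
import json

def average_latency(samples):
    """Pure function: the result depends only on the value passed in."""
    latencies = samples["latencies_ms"]
    return sum(latencies) / len(latencies)

# The failing input, as it might be pasted into an email.
reported = '{"latencies_ms": []}'

def reproduce(text):
    """Parse the reported value and rerun the operation on it."""
    value = json.loads(text)
    try:
        average_latency(value)
    except ZeroDivisionError:
        return "reproduced"
    return "no failure"

outcome = reproduce(reported)
```

Nothing else has to be set up first: the value is the whole test case.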
- 24:00 Easy to Fabricate
- Anything can create compliant values (any language): for test,
simulation, etc.
- vs. getting your program to a particular state via interacting objects, ugh!
- Places: must emulate operational interface.
- 25:30 Thwart Imperativeness
- Values refuse to help you program imperatively; and that's a feature!
- Imperative code is inherently complex.
- Places: encourage/force and require imperativeness.
- 26:30 Language Independence
- Pure values are language independent: thus, the polyglot tool.
- Places: defined by language constructs (methods); you're stuck, don't have a definition independent of your language; can be proxied, remoted - but with much effort that doesn't add value.
- 27:45 Values are Generic
- Representations in any language.
- Few fundamental abstractions; for aggregation (lists, maps,
sets).
- "Fewer than 20 of these things (values)."
- "How many people can build a complex system w/ just 20 Java classes?"
- Places: operational interface is specific; generates a ton more
code; reuse is the big lie in OOP - it has poor reuse, and we
know it.
- Everything new needs another class. e.g. If you and I each have a Person class, with same field names, what can we do with those two things? NOTHING! Semantically identical, same names, same fields, still nothing.
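The two-Person-classes complaint translates loosely to Python (the talk is about Java; `PersonA`, `PersonB`, and the sample data below are made up for illustration). Plain maps stand in for the generic abstraction, even though Python's `dict` is itself mutable.

```python
class PersonA:
    def __init__(self, name, email):
        self.name = name
        self.email = email

class PersonB:
    def __init__(self, name, email):
        self.name = name
        self.email = email

a = PersonA("Ada", "ada@example.com")
b = PersonB("Ada", "ada@example.com")
classes_equal = (a == b)   # default object equality compares identity

# The same information as plain maps: equality and generic tools work immediately.
m1 = {"name": "Ada", "email": "ada@example.com"}
m2 = {"name": "Ada", "email": "ada@example.com"}
maps_equal = (m1 == m2)
merged = {**m1, "city": "Copenhagen"}
```

Same names, same fields, and the two class instances still have nothing to say to each other, while the maps compare and merge with no extra code.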
- 29:50 Values Are the Best Interface
- For subsystems: values can be readily moved, or ported, or
enqueued.
- Data-driven interfaces => easy to move, portability, just passing data; easy to stick a queue as needed in the middle of that.
- Places: application, language, and flow likely all coupled; major architecture limitations esp. when you need to scale from 1 machine to n machines.
- We already program with data (REST, JSON, etc.) when we program in the large, we should stop doing PLOP in the small.
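A minimal sketch of a data-driven interface between two subsystems, using Python's standard `queue` and `json` modules. The producer, consumer, and order fields are hypothetical; the point is only that plain data crosses the boundary.

```python
import json
import queue

orders = queue.Queue()

def producer(q):
    # Plain data, serialized; no object or method crosses the boundary.
    q.put(json.dumps({"order_id": 17, "items": ["tea", "cups"]}))

def consumer(q):
    # The consumer needs no shared classes, only the ability to parse the value.
    message = json.loads(q.get())
    return len(message["items"])

producer(orders)
item_count = consumer(orders)
```

Because only a serialized value sits on the queue, the consumer could in principle move to another process, machine, or language without changing the producer.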
- 33:10 Values Aggregate
- Values aggregate to values, e.g. 5 values -> list of values, is still a value; all the characteristics are great and stay great.
- So all benefits accrue to compositions of values.
- Places don't aggregate:
- Combinations of places, what properties does the composite have? "None! You have to start from zero again…Nothing composes with places."
- Need new operational interface for aggregate.
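A small Python sketch of values aggregating into values (my illustration, not from the talk; `Point` and the paths are invented). A frozen dataclass and a tuple stand in for immutable composites.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Point:
    x: int
    y: int

path_a = (Point(0, 0), Point(1, 2))
path_b = (Point(0, 0), Point(1, 2))

aggregate_equal = path_a == path_b           # equality composes
usable_as_key = {path_a: "route 1"}[path_b]  # hashing composes too

# A list of mutable lists offers no such guarantee.
try:
    hash([[0, 0], [1, 2]])
    hashable = True
except TypeError:
    hashable = False
```

The tuple of points inherits equality and hashing from its parts with no new interface, which is the "all benefits accrue to compositions" claim in miniature.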
- 34:35 Extended Value Propositions (moving beyond single-process)
- Mechanism for conveyance and perception
- Mechanism for memory
- Reduced coordination
- Location flexibility
- Essential for decision-making - and that's our job in IT
- 35:10 Conveyance
- In the small
- Aliases of values convey value - we're done: simple, very cheap, worry-free
- Mutable things on queues convey nothing: they are references to things that can change. Conveying places is extremely difficult; "you have to turn it into a value essentially."
- In the large
- Values rule on the wire: HTTP, distributed programming, we don't send mutable objects across the wire - early attempts to do so were "utter, complete, total" failures.
- No reproducible values in the PLOP DBs: e.g. sending you a primary key tells you…nothing! You'll have to do a lookup to get facts that may have changed.
- 37:25 Perception (flip side of conveyance)
- In the small
- Values: to reach is to perceive; if I can reach your value, I can see it and use it.
- Places: How to perceive a coherent value of (a mutable)
object with multiple getters? You can't!
- You need a "recipe" for doing it - the copying, cloning, or locking recipe. Constantly re-invented. And those recipes don't aggregate/compose.
- In the large
- Values still rule on the wire, e.g. we don't "chat" w/ the operational interface to download a page piecemeal, we want the value of the page.
- No reproducible values in PLOP DBs - or they require a transaction to do it.
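The multiple-getters problem from "in the small" can be shown without threads by writing out one unlucky interleaving by hand. A Python sketch with a hypothetical `Account` class (not from the talk):

```python
class Account:
    """A mutable place with one getter per field."""

    def __init__(self, checking, savings):
        self._checking = checking
        self._savings = savings

    def checking(self):
        return self._checking

    def savings(self):
        return self._savings

    def transfer(self, amount):
        self._checking -= amount
        self._savings += amount

acct = Account(checking=100, savings=0)
seen_checking = acct.checking()   # the reader starts looking...
acct.transfer(100)                # ...a writer runs in between...
seen_savings = acct.savings()     # ...and the reader finishes
incoherent_total = seen_checking + seen_savings  # a state that never existed

# With a value there is a single thing to reach, so the view is coherent.
balances = (0, 100)               # (checking, savings) after the transfer
coherent_total = sum(balances)
```

The reader saw 200 in an account that only ever held 100. Fixing that for the mutable version needs a lock or a copy, i.e. one of the "recipes" above.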
- 40:15 Memory
- In the small
- Values: remembering == aliasing, nothing to it.
- Places: copy, if you can.
- In the large
- What if there were no permalinks?
- First, it was all static pages. But once dynamic sites came online, permalinks became essential.
- Place-oriented DBs - DIY time: you end up keeping timestamped records yourself and you'll need a now query.
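What the DIY version tends to look like, as a rough Python sketch under my own assumptions: facts are `(timestamp, email)` tuples with ISO-format dates kept in timestamp order, and `email_as_of` and `email_now` are hypothetical helper names.

```python
from bisect import bisect_right

# (timestamp, email) facts, kept in timestamp order; nothing is overwritten.
facts = [
    ("2010-01-15", "old@example.com"),
    ("2012-05-20", "new@example.com"),
]

def email_as_of(facts, when):
    """Return the email that was current at `when`, or None if none was known yet."""
    timestamps = [t for t, _ in facts]   # ISO dates sort correctly as strings
    i = bisect_right(timestamps, when)
    return facts[i - 1][1] if i else None

def email_now(facts):
    """The "now" query is just the latest fact."""
    return facts[-1][1] if facts else None
```

For example, `email_as_of(facts, "2011-06-01")` gives the old address while `email_now(facts)` gives the new one; both answers stay available because the old fact was never erased.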
- 42:30 Reduced Coordination
- In the small
- Values: No locks! "No such thing as contention for values."
- Places: Lock policies don't aggregate.
- In the large
- No read transactions! (for values)
- PLOP: Often gotten wrong, to read consistently you have to "hold up the world" to do a read - problem of coordination, architecture, scaling, etc.
- 43:53 Location Flexibility
- In the small
- Values: aliasing means only one copy.
- Places: master copy is special, need coordination.
- In the large
- Cache, e.g. HTTP caching - declare a value as stable, thus "you don't need to come to me every time you need it."
- CDN etc. - why do we have it for webpages but not databases?
- Data-based interface is movable - don't care where you are, what language, etc.
- So we "get" values for communications protocols, but not for databases.
- Decision-making - covered in the next few slides below.
- 45:25 The Big Point: Facts are Values
- Facts are not places!
- Don't facts change? NO - they incorporate time.
- A new email address for a friend doesn't change the fact of what his old email address was.
- "This goes to the very core of what a fact is." Fact: "an event or
thing known to have happened or existed"
- From Latin: factum - 'something done' or something that happened.
- "Information is based on facts, and fact doesn't mean the most recent fact."
- 46:42 Facts != Recent Facts
- Knowledge is derived from facts
- Comparing
- Combining
- Especially, we compare/combine from different time points.
- "Imagine if you only knew the present value of everything; what kind of decision-making power would you have? It's dramatically reduced!" He mentions the movie Memento, where amnesia dogs the main character constantly. We make decisions by "delta" with the past.
- You cannot update a fact, any more than you can change the past.
- 47:53 Information Systems (revisited)
- Are fundamentally about facts
- Should be completely about maintaining, manipulating facts.
- To give users leverage over facts
- Making decisions
- Systems should be value-oriented, not place-oriented
- Don't use process constructs for information
- "Their [objects] use for information is an idea bereft of merit. There is not one good component of using mutable objects for information; it's just wrong!"
- 48:52 Decision Making
- We know what it takes to support our own decision-making (hint: information)
- Compare present to past
- Spot trends, rates
- Aggregates
- Often requires time
- 49:37 Programmer I.T. (the tools we coders created for ourselves)
- Source Control
- Update in place? No!
- Timestamps - of course!
- Logs
- Update in place? No!
- Timestamps - of course!
- 51:47 Big Data
- "It's my contention that most or a very large portion of Big Data
is this:"
- Business to programmers: "I like your database (the logs) better than the one you gave me."
- Logs have all the information
- And timestamps
- We are reactive here ("It's quite embarrassing.")
- Mining logs, seriously? flat-files?! They're easy to append to, but not good data structures.
- Not delivering leverage - and co-mingled with a bunch of crap.
- "You should all look into the deep reasons why this is happening: we built better information systems for ourselves (e.g. source control) than we built for our customers."
- 54:28 The Space Age (what we coders are moving into)
- Space
- 'The unlimited expanse in which all things are located, and all events occur.'
- Encompasses both space and time, which are closely connected.
- Virtual memory, and garbage collector, were early tools of Space Age.
- If new never fails…you are effectively running in space, not in a place.
- If S3 never fills up, or you can just get another hard drive…it is not the cloud, but space.
- "What does this mean? It means we ARE building information systems, those systems maintain facts, and that new facts call for new space. This is the end for PLOP."
- 56:11 New Facts, New Space
- The end of PLOP
- If you can afford this, why do anything else?
- You can afford this
- there will be garbage, GC w/ storage
- 57:10 Summary
- We continue to use place-oriented programming languages and databases
- and make new ones!
- "This is the saddest thing." It's not about SQL vs. NoSQL
- long after rationale is gone
- We are missing out on the value of values
- which we recognize and cannot deny, in our own IT services for ourselves
- We need to deliver information systems
- demand is clear (Big Data, we need to know everything, why is it only in the logs?), resources available
- "Facts do not cease to exist because they are ignored." - Aldous Huxley