How to do things with AI & law research

Dutch Legal Tech Meetup, 2015-11-02
Anna Ronkainen
Chief Scientist, TrademarkNow Inc.
@ronkaine

A tale of two origin stories

‘Preliminary try-outs of decision machines built according to various formal specifications can be made in relation to selected administrative or judicial tribunals. The Supreme Court might be chosen for the purpose.’ (Harold Lasswell 1955)

‘Can we “feed” into the computer that the judge’s ulcer is getting worse, that he had fought earlier in the morning with his wife, that the coffee was cold, that the defence counsel is an apparent moron, that the temporarily assigned associate judge is unfamiliar with the law and besides smokes obnoxious cigars, that the tailor’s bill was outrageous etc. etc.?’ (Kaarle Makkonen 1968, translation ar)

“As we know, there are known knowns. There are things we know we know. We also know there are known unknowns, that is to say, we know there are some things we do not know. But there are also unknown unknowns, the ones we don’t know we don’t know.” – Donald Rumsfeld (2002)

(Un)known (un)knowns

known unknowns      known knowns
unknown unknowns    ??

known unknowns      known knowns
unknown unknowns    unknown knowns

conscious ignorance      conscious knowledge
unconscious ignorance    unconscious knowledge

Dual-process cognition

System 1
•  evolutionarily old
•  unconscious, preconscious
•  shared with animals
•  implicit knowledge
•  automatic
•  fast
•  parallel
•  high capacity
•  intuitive
•  contextualized
•  pragmatic
•  associative
•  independent of general intelligence

System 2
•  evolutionarily recent
•  conscious
•  distinctively human
•  explicit knowledge
•  controlled
•  slow
•  sequential
•  low capacity
•  reflective
•  abstract
•  logical
•  rule-based
•  linked to general intelligence

(Frankish & Evans 2009)

Systems 1 and 2 in legal reasoning: interaction

System 1: making the decision
System 2: validation and justification

(Ronkainen 2011)

What’s that got to do with legal AI?
-  MOSONG, my 1st (and so far only) system prototype
-  built for studying the use of fuzzy logic in modelling various issues in legal theory
-  specifically, the use of Type-2 fuzzy logic for modelling vagueness and uncertainty
-  trademarks initially just a random example domain
-  but the knowledge acquired through this research also proved useful for TrademarkNow...

Open texture

‘Whichever device, precedent or legislation, is chosen for the communication of standards of behaviour, these, however smoothly they work over the great mass of ordinary cases, will, at some point where their application is in question, prove indeterminate; they will have what has been termed an open texture.’ (Hart 1961)

Standard example of open texture: No vehicles in a park

‘When we are bold enough to frame some general rule of conduct (e.g. a rule that no vehicle may be taken into the park), the language used in this context fixes necessary conditions which anything must satisfy if it is to be within its scope, and certain clear examples of what is certainly within its scope may be present to our minds.’ (Hart 1961)

... but that’s really a stupid example: vehicles are already categorized in excruciating detail, so being more precise costs nothing

Inescapable open texture: No boozing in a park (but “civilized” drinking is okay)

Section 4

Intake of intoxicating substances

The intake of intoxicating substances is prohibited in public places in built-up areas [...].

The provisions of paragraph 1 do not concern [...] the intake of alcoholic beverages in a park or in a comparable public place in a manner such that the intake or the presence associated with it does not obstruct or unreasonably encumber other persons’ right to use the place for its intended purpose.

(Finland: Public Order Act (612/2003))

Inescapable open texture: Trademark similarity (Mosong)

Article 8
Relative grounds for refusal

1. Upon opposition by the proprietor of an earlier trade mark, the trade mark applied for shall not be registered:

(a) if it is identical with the earlier trade mark and the goods or services for which registration is applied for are identical with the goods or services for which the earlier trade mark is protected;

(b) if because of its identity with or similarity to the earlier trade mark and the identity or similarity of the goods or services covered by the trade marks there exists a likelihood of confusion on the part of the public in the territory in which the earlier trade mark is protected; the likelihood of confusion includes the likelihood of association with the earlier trade mark.

[...]

(CTM Regulation (40/94/EC))

Mosong: the domain

Tentative rule for Article 8(1) of the CTM Regulation, Relative grounds for refusal (quoted above):

REFUSAL = MARKS-SIMILAR and GOODS-SIMILAR
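One way to read the tentative rule is as a fuzzy conjunction over degrees of similarity. The following is a minimal sketch only: it assumes plain (Type-1) membership degrees in [0, 1] and the minimum t-norm for "and", whereas MOSONG is described as using Type-2 fuzzy logic, which this does not reproduce. The function names and the example degrees are hypothetical.

```python
def fuzzy_and(a: float, b: float) -> float:
    """Minimum t-norm: a common choice for fuzzy 'and' (an assumption here)."""
    for degree in (a, b):
        if not 0.0 <= degree <= 1.0:
            raise ValueError("membership degrees must lie in [0, 1]")
    return min(a, b)


def refusal(marks_similar: float, goods_similar: float) -> float:
    """REFUSAL = MARKS-SIMILAR and GOODS-SIMILAR, evaluated on degrees."""
    return fuzzy_and(marks_similar, goods_similar)


# Hypothetical degrees, not taken from any real case.
print(refusal(0.8, 0.6))  # 0.6
print(refusal(0.9, 0.1))  # 0.1
```

With the minimum t-norm, the weaker of the two similarities caps the degree of refusal, so very similar marks for clearly dissimilar goods still yield a low value.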

‘Training’ set: 119 cases

119 OHIM cases from 1997–2000, of which 107 from the Opposition Division (1st instance) and 12 from the Boards of Appeal (2nd instance)

[Figure: Results for the training set; axis scale 0 to 1]

Validation set

30 most recent (2002) relevant cases from OHIM: 20 from the Opposition Division and 10 from the Boards of Appeal

Result*: all cases predicted correctly

* when coded into the system by a domain expert

[Figure: Results for the validation set; axis scale 0 to 1]

Non-expert validation
•  done by non-law students taking a course on intellectual property law (n=75)
•  original validation set in two parts (15+15 cases), at the beginning and the end of the course
•  completed non-interactively through a web form
•  correct answer: 54.6±6.5%
•  incorrect answer: 25.9±7.5%
•  no answer: 19.5±5.2%
(± = σ)

Non-expert validation

% correct ± stderr   before     after      total
group 1 (n=15)       41.3±1.7   65.8±2.8   53.5±1.7
group 2 (n=12)       46.1±2.0   65.0±3.0   55.6±1.9
group 3 (n=48)       43.3±1.3   65.9±1.3   54.7±0.9
total (n=75)         43.4±1.0   65.8±1.1   54.6±0.8

Initial conclusions from this work
-  it (sort of) works; using fuzzy logic makes sense in this context
-  poses more questions than it answers...
-  ...and that’s how I ended up trying to reverse-engineer human lawyers rather than just trying to build systems based on existing legal theory literature

Implications for legal AI
-  using rule-based methods has its advantages
   -  human-readable
   -  comparatively quick to develop
   -  modifiable (esp. relevant wrt legislative changes)
-  but they can’t do the work alone
   -  can’t make sense of situations which they weren’t specifically built to handle
   -  real-world complexity needs (sometimes) statistical/machine-learning approaches

That second origin story...

It started with a frustrated trademark attorney...
-  Mikael Kolehmainen – now TMnow CEO – worked as a trademark attorney at one of the biggest boutique TM firms in Finland
-  he was deeply frustrated with the existing way of doing trademark searches and wanted to do something about it
-  first he found Matti Kokkola (CTO) through a common acquaintance
-  then me (Chief Scientist) via the university, totally by accident
-  and finally Heikki Vesalainen (Chief Architect), who had co-founded a company with Matti as sophomores
-  development work started in spring 2012, seed funding from Lifeline Ventures in August 2012
-  first release (then as Onomatics Quick Search) in Oct 2012

So, what came out of it?

About TrademarkNow
-  trademark legal technology provider founded in 2012, based in Helsinki, NYC and Kilkenny, now ~30 employees
-  total funding so far ~3 MEUR (plus gov’t grants and loans (Tekes))
-  products based on a unique model of likelihood of confusion for trademarks
   -  NameCheck: intelligent TM search
   -  NameWatch: intelligent TM watch

Trademark search: Ye olde way
-  TM lawyer formulates search strategy (wildcards & classes to be used etc.)
-  paralegal carries out the search and hands the results to the lawyer
-  lawyer browses through the results and marks the ones for which more info is needed
-  paralegal retrieves additional info
-  lawyer reviews said info, evaluates risk, reports
-  whole process typically takes several days, while stakeholders often expect results immediately

Trademark search: The new way


Not just trademark search

Core component: AI model of likelihood of confusion (TM similarity)
-  similarity of trademarks
   -  phonetic
   -  graphic
   -  semantic
   -  currently only word marks
-  similarity of goods and services
   -  others do this only using the 45 classes of the Nice Classification
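To make the two dimensions concrete, here is a rough sketch of one signal from each. It is an illustration under stated assumptions, not TrademarkNow’s model: `graphic_similarity` uses the standard library’s `difflib.SequenceMatcher` as a crude stand-in for the graphic similarity of two word marks, and `share_nice_class` shows the class-level baseline mentioned above, where goods and services count as similar only if the two marks share a Nice class. All names, marks and class lists are hypothetical.

```python
from difflib import SequenceMatcher


def graphic_similarity(mark_a: str, mark_b: str) -> float:
    """Character-level similarity in [0, 1]; a crude stand-in for visual similarity."""
    return SequenceMatcher(None, mark_a.casefold(), mark_b.casefold()).ratio()


def share_nice_class(classes_a: set[int], classes_b: set[int]) -> bool:
    """Class-level baseline: similar only if at least one Nice class is shared."""
    for number in classes_a | classes_b:
        if not 1 <= number <= 45:
            raise ValueError(f"not a Nice class number: {number}")
    return bool(classes_a & classes_b)


# Hypothetical marks and class lists.
print(round(graphic_similarity("Lumora", "Lumara"), 2))
print(share_nice_class({9, 42}, {42}))  # True
print(share_nice_class({25}, {18}))     # False
```

By construction, the class-level baseline cannot register similarity between goods in different classes, nor dissimilarity between goods within the same class.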

Mix-and-match approach to AI techniques
-  traditionally AI has been divided into two factions: rule-based and statistical
-  all our competitors are also either-or
-  both have their advantages and disadvantages
-  unlike the mainstream, we’re flexible and use both as we see fit, to maximize their benefits

Customers ♥ TrademarkNow


Lessons learned & stuff

Research commercialization is difficult in general – not only for AI & law
-  innovation and commercialization are tossed around as vital research policy goals a lot these days pretty much wherever you go
-  said tossers* tend to treat it as a black box, basically thinking that telling academics to be innovative is all it takes
-  there are two parts in the equation, and only one of them can be said to be the academics’ responsibility

* sorry, couldn’t resist

Why research commercialization fails
-  most such ventures fail for a simple reason: putting the cart before the horse
   -  solution looking for a problem, not the other way around
-  academics (typically) don’t have a very commercially oriented mindset
-  perhaps most importantly, product design and management are often left out of the equation altogether
-  basic research is a fairly blunt instrument: the research end-product (good enough for publication) is very different from a marketable and commercially viable product

The first part of the equation: What academics can do about it
-  consider potential uses even when planning and carrying out basic research
-  and of course there’s also applied research: for legal tech, a lot of general AI/NLP stuff just waiting to be (tried out to see if it can be) used (cf. e-discovery)
-  try to take an active role in seeking out potential partners for commercialization (no time for that, I know...)

Applied and basic research: Pasteur’s quadrant

                                 Considerations of use?
                                 no                           yes
Quest for fundamental     yes    Pure basic research (Bohr)   Use-inspired basic research (Pasteur)
understanding?            no     -                            Pure applied research (Edison)

(Stokes 1997)

The other part of the equation: The people with the actual problems
-  you are more likely to end up with a viable product when you start with a problem and use research to look for a solution, not the other way around
-  the initiative should come from someone who has experienced the pain points first hand – or at least people who can see an inefficiency, have an idea about what to do about it, and can figure out how to fill in the blanks

A unique opportunity for the Netherlands
-  a lot of money was invested in legal informatics research in the 1980s
-  the targeted investment is now gone, but its traces are still clearly visible in the AI & law community (e.g. JURIX)
-  several successful projects in collaboration with the government
-  but (almost?) no startups originating from the research community
-  so potentially an enormous source of expertise for new legal startups to draw on

Thank you!
