Patent No. 7,979,907 Petition For Inter Partes Review
Paper No. 1
IN THE
UNITED STATES PATENT AND TRADEMARK OFFICE
BEFORE THE PATENT TRIAL AND APPEAL BOARD
_____________
SYMANTEC CORPORATION,
Petitioner
- vs. -
THE TRUSTEES OF COLUMBIA UNIVERSITY
IN THE CITY OF NEW YORK,
Patent Owner
_____________
Patent No. 7,979,907
Issued: July 12, 2011
Inventors: Matthew G. Schultz, Eleazar Eskin, Erez Zadok, Manasi Bhattacharyya, and Salvatore J. Stolfo
Title: SYSTEMS AND METHODS FOR DETECTION OF NEW MALICIOUS EXECUTABLES
Inter Partes Review No.
PETITION FOR INTER PARTES REVIEW OF U.S. PATENT NO. 7,979,907
UNDER 35 U.S.C. §§ 311-319 AND 37 C.F.R. §§ 42.1-.80, 42.100-.123
_____________
Mail Stop Patent Board
Patent Trial and Appeal Board
P.O. Box 1450
Alexandria, VA 22313-1450
December 5, 2014
TABLE OF CONTENTS
I. INTRODUCTION
II. MANDATORY NOTICES (37 C.F.R. § 42.8(A)(1))
A. Real Party-In-Interest (37 C.F.R. § 42.8(b)(1))
B. Notice of Related Matters (37 C.F.R. § 42.8(b)(2))
C. Designation of Lead and Backup Counsel (37 C.F.R. § 42.8(b)(3))
D. Service of Information (37 C.F.R. § 42.8(b)(4))
III. GROUNDS FOR STANDING (37 C.F.R. § 42.104(A))
IV. IDENTIFICATION OF CHALLENGE (37 C.F.R. § 42.104(B))
A. Effective Filing Date of the ’907 patent
B. There Is a Reasonable Likelihood That at Least One Claim of the ’907 Patent Is Unpatentable under 35 U.S.C. § 103.
V. OVERVIEW OF THE ’907 PATENT
VI. CONSTRUCTION OF THE CHALLENGED CLAIMS (37 C.F.R. § 42.104(B)(3))
VII. THE CHALLENGED CLAIMS ARE UNPATENTABLE
A. Classifying executable email attachments using byte sequence features is not new
1. U.S. Patent No. 5,832,208 (“Chen”)
2. “A Constructive Induction Approach to Computer Immunology” (“Cardinale”)
3. “Automatically Generated WIN32 Heuristic Virus Detection” (“Arnold”)
4. “Attacks on WIN32” (“Szor”)
5. U.S. Patent No. 6,823,323 (“Forman”)
6. “Boosting and Naïve Bayesian Learning” (“Elkan”)
7. Admitted Prior Art (“APA”)
B. Reasons the Claims are Unpatentable
1. Ground 1: Chen in View of Cardinale and further in view of Elkan Renders Obvious Claims 10, 11, and 15-17 Under 35 U.S.C. 103(a)
a. Claim 10: “A system for classifying an executable attachment in an email received at a computer system”
b. Claim 10: “one or more computer processors executing instructions”
c. Claim 10: “a) an email filter configured to filter said executable attachment from said email”
d. Claim 10: “b) a feature extractor configured to extract a byte sequence feature from said executable attachment”
e. Claim 10: “c) a rule evaluator configured to: classify said executable attachment by comparing said byte sequence feature of said executable attachment to a classification rule set derived from byte sequence features of a set of executables having a predetermined class in a set of classes”
f. Claim 10: “determine a probability that said executable attachment is a member of a class of said set of classes based on said byte sequence feature, and divide the determination of said probability into a plurality of processing steps and to execute said processing steps in parallel”
g. Claim 11: “extract static properties of said executable attachment”
h. Claim 15: “predict the classification of said executable attachment as one class of a set of classes consisting of malicious, benign, and borderline”
i. Claim 16: “determine said probability that said executable attachment is a member of one class of said set of classes with a Naive Bayes algorithm”
j. Claim 17: “determine said probability that said executable attachment is a member of a class of said set of classes with a multi-Naive Bayes algorithm”
2. Ground 2: Chen in View of Cardinale and further in view of Elkan and APA Renders Obvious Claim 12 Under 35 U.S.C. 103(a)
a. Claim 12: “convert said executable attachment from binary format to hexadecimal format”
3. Ground 3: Chen in View of Cardinale and further in view of Elkan and Szor Renders Obvious Claim 13 Under 35 U.S.C. 103(a)
a. Claim 13: “create a byte string representative of resources referenced by said executable attachment”
4. Ground 4: Chen in View of Cardinale and further in view of Elkan and Arnold Renders Obvious Claim 14 Under 35 U.S.C. 103(a)
a. Claim 14: “predict the classification of said executable attachment as one class of a set of classes consisting of malicious and benign”
5. Ground 5: Chen in View of Cardinale and further in view of Elkan and Forman Renders Obvious Claims 1, 2, 6-9, and 18-20 Under 35 U.S.C. 103(a)
a. Claim 1: “A method for classifying an executable attachment in an email received at a computer system”
b. Claim 1: “a) filtering said executable attachment from said email”
c. Claim 1: “extracting a byte sequence feature from said executable attachment”
d. Claim 1: “c) classifying said executable attachment by comparing said byte sequence feature of said executable attachment with a classification rule set derived from byte sequence features of a set of executables having a predetermined class in a set of classes”
e. Claim 1: “wherein said classifying comprises determining using a computer processor, with a Multi-Naive Bayes algorithm, a probability that said executable attachment is a member of each class in said set of classes based on said byte sequence feature and dividing said step of determining said probability into a plurality of processing steps and executing said processing steps in parallel”
f. Claim 2: “extracting static properties of said executable attachment”
g. Claim 6: “determining a probability that said executable attachment is a member of each class in a set of classes consisting of malicious, benign, and borderline”
h. Claims 7 and 18: “classify[ing] said executable attachment as malicious if said probability that said executable attachment is malicious is greater than said probability that said executable attachment is benign”
i. Claims 8 and 19: “classify[ing] said executable attachment as benign if said probability that said executable attachment is benign is greater than said probability that said executable attachment is malicious”
j. Claims 9 and 20: “classify[ing] said executable attachment as borderline if a difference between said probability that said executable attachment is benign and said probability that said executable attachment is malicious is within a predetermined threshold”
6. Ground 6: Chen in View of Cardinale and further in view of Elkan, Forman, and Arnold Renders Obvious Claim 5 Under 35 U.S.C. 103(a)
a. Claim 5: “determining a probability that said executable attachment is a member of each class in a set of classes consisting of malicious and benign”
7. Ground 7: Chen in View of Cardinale and further in view of Elkan, Forman, and APA Renders Obvious Claim 3 Under 35 U.S.C. 103(a)
8. Ground 8: Chen in View of Cardinale and further in view of Elkan, Forman, and Szor Renders Obvious Claim 4 Under 35 U.S.C. 103(a)
VIII. CONCLUSION
EXHIBIT LIST (37 C.F.R. § 42.63(e))
Exhibit Description
1001 U.S. Patent No. 7,979,907 to Schultz et al.
1002 File History of U.S. Patent No. 7,979,907
1003 Declaration of Michael T. Goodrich, Ph.D.
1004 Curriculum vitae of Michael T. Goodrich, Ph.D.
1005 The Trustees of Columbia University in the City of New York v. Symantec Corp., Civil Action No. 3:13-cv-808, Oct. 7, 2014 Claim Construction Order (Dkt. No. 123)
1006 The Trustees of Columbia University in the City of New York v. Symantec Corp., Civil Action No. 3:13-cv-808, October 23, 2014 Memorandum Order Clarifying Claim Construction (Dkt. No. 146)
1007 U.S. Patent No. 5,832,208 to Chen et al.
1008 Cardinale, K. et al., “A Constructive Induction Approach to Computer Immunology,” published March 1999
1009 Arnold, W. et al., “Automatically Generated WIN32 Heuristic Virus Detection,” Virus Bulletin Conference September 2000, published September 2000
1010 Szor, P. et al., “Attacks on WIN32,” Virus Bulletin Conference October 1998, published October 1998
1011 U.S. Patent No. 6,823,323 to Forman et al.
1012 Elkan, C., “Boosting and Naïve Bayesian Learning,” Technical Report No. CS97-557, published September 1997
I. INTRODUCTION
In accordance with 35 U.S.C. §§ 311-319 and 37 C.F.R. §§ 42.1-.80 & 42.100-
.123, inter partes review is respectfully requested for claims 1-20 of United States Patent
No. 7,979,907 to Schultz et al., titled “Systems and Methods for Detection of New
Malicious Executables” (the “’907 patent”), owned by The Trustees of Columbia
University in the City of New York (“Columbia”). (EXHIBIT 1001 (“Ex. 1001”).) This
petition demonstrates that there is a reasonable likelihood that Petitioner will
prevail on at least one of the claims challenged in the petition based on prior art
references that the United States Patent and Trademark Office (“USPTO”) did not have
before it during prosecution. Claims 1-20 of the ’907 patent should therefore be can-
celed as unpatentable.
II. MANDATORY NOTICES (37 C.F.R. § 42.8(A)(1))
A. Real Party-In-Interest (37 C.F.R. § 42.8(b)(1))
The real party-in-interest for this petition is Symantec Corporation (“Petition-
er” or “Symantec”).
B. Notice of Related Matters (37 C.F.R. § 42.8(b)(2))
The ’907 patent is presently the subject of the following patent infringement
lawsuit brought by Columbia in the Eastern District of Virginia, Richmond Division:
Civil Action No. 3:13-cv-808 against Symantec. Concurrent with the instant petition,
Petitioner is also filing petitions requesting inter partes review of U.S. Patent Nos.:
7,487,544, 8,601,322, 8,074,115, 7,448,084, and 7,913,306.
C. Designation of Lead and Backup Counsel (37 C.F.R. § 42.8(b)(3))
Lead: David D. Schumann, Reg. No. 53,569. Email:
Backup: Brian M. Hoffman, Reg. No. 39,713. Email:
Address for both counsel: FENWICK & WEST LLP, 555 California Street,
12th Floor, San Francisco, CA 94104, Tel: (415) 875-2300, Fax: (415) 281-1350.
D. Service of Information (37 C.F.R. § 42.8(b)(4))
Service of any documents via hand-delivery may be made at the postal mailing
addresses of the respective lead and back-up counsel designated above with courtesy
copies to the email addresses [email protected] and
[email protected]. Petitioner consents to electronic service.
III. GROUNDS FOR STANDING (37 C.F.R. § 42.104(A))
Petitioner certifies pursuant to Rule 42.104(a) that the ’907 patent is available
for inter partes review and that Petitioner is not barred or estopped from requesting an
inter partes review challenging the validity of the above-referenced claims of the ’907
patent on the grounds identified in the petition.
IV. IDENTIFICATION OF CHALLENGE (37 C.F.R. § 42.104(B))
A. Effective Filing Date of the ’907 patent
The ’907 patent issued from U.S. Application No. 12/338,479 filed on Decem-
ber 18, 2008. The ’479 Application is a continuation of U.S. Application 10/208,432,
filed July 30, 2002, now U.S. Patent No. 7,487,544, and which claims the benefit of
U.S. Provisional Application Nos. 60/308,622, filed July 30, 2001 and 60/308,623, al-
so filed July 30, 2001.
B. There Is a Reasonable Likelihood That at Least One Claim of the ’907 Patent Is Unpatentable under 35 U.S.C. § 103.
The challenged claims are generally directed to a system and method for detect-
ing malicious executables in email attachments. Prior art had disclosed the subject
matter of these claims. The claims are unpatentable in view of the following patents
and publications:
U.S. Patent No. 5,832,208, filed on September 5, 1996, issued on No-
vember 3, 1998, and titled “Anti-virus agent for use with databases and
mail servers” (“Chen”) (Ex. 1007). This patent is prior art to the ’907
patent under pre-AIA §§ 102(a) and (b).
Cardinale, K. et al., “A Constructive Induction Approach to Computer
Immunology,” published March 1999 (“Cardinale”) (Ex. 1008). This
publication is prior art to the ’907 patent under pre-AIA §§ 102 (a) and
(b).
Arnold, W. et al., “Automatically Generated WIN32 Heuristic Virus De-
tection,” Virus Bulletin Conference, September 2000, published Sep-
tember 2000 (“Arnold”) (Ex. 1009). This publication is prior art to the
’907 patent under pre-AIA § 102(a).
Szor, P. et al., “Attacks on WIN32,” Virus Bulletin Conference, October
1998, published October 1998 (“Szor”) (Ex. 1010). This publication is
prior art to the ’907 patent under pre-AIA §§ 102(a) and (b).
U.S. Patent No. 6,823,323, filed on April 26, 2001, published on October
31, 2002, and titled “Automatic classification method and apparatus”
(“Forman”) (Ex. 1011). This patent is prior art to the ’907 patent under
pre-AIA § 102(e).
Elkan, C., “Boosting and Naïve Bayesian Learning,” Technical Report
No. CS97-557, published September 1997 (“Elkan”) (Ex. 1012). This
publication is prior art to the ’907 patent under pre-AIA §§ 102(a) and
(b).
Section VII below explains how the above-cited references create a reasonable
likelihood that the Petitioner will prevail on at least one of the challenged claims. See
35 U.S.C. § 314(a). Indeed, section VII, as supported by the Declaration of Michael
T. Goodrich, Ph.D. and the claim charts attached thereto (Ex. 1003), demonstrates
that all of the challenged claims are rendered obvious in view of these references. Pe-
titioner requests cancellation of claims 1-20 as unpatentable under 35 U.S.C. § 103.
V. OVERVIEW OF THE ’907 PATENT
The ’907 patent discloses “[a] system and methods for detecting malicious
executable attachments at an email processing application of a computer system using
data mining techniques.” (Abstract.)1 Figure 8 of the ’907 patent is shown below.
1 All citations in this section are to the ’907 patent (Ex. 1001).
Figure 8 illustrates “[t]he process of detecting malicious emails.” (12:65). The
process 100 begins when a server receives emails at step 102. (12:66-67). “[T]he
emails are filtered to extract attachments or other components from the email (step
104).” (13:6-7). The extracted attachments may be saved as a file. (13:7-8).
Next, at step 106, “features” in the executable attachment are extracted.
(13:19-20). These “features” include properties extracted from the attachment, such
as byte sequences of hexadecimal characters that represent the machine code in the
executable attachment. (See 13:19-44).
“The features extracted from the attachment in step 106 are evaluated using [a]
classification rule set . . . , and the attachment is classified as malicious or benign (step
108).” (13:45-48). The “classification rule set [is] derived from byte sequence features
of a data set of known executables having a predetermined class in a set of classes,
e.g., malicious or benign.” (Abstract). In addition to identifying an attachment as ma-
licious or benign, step 110 may also be included to identify executables that are bor-
derline (e.g., cannot be classified as either malicious or benign). (See 14:4-29). Finally,
at step 112, the analyzed attachment is logged along with other information such as
whether the attachment was malicious, benign, or borderline. (14:56-59).
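For illustration only, the three-way outcome of steps 108 and 110 can be sketched in Python using the comparison recited in claims 7-9 and 18-20 of the ’907 patent. The sketch is not taken from the specification: the function name, the parameter names, the order in which the tests are applied, and the reading of “within a predetermined threshold” as “less than or equal to” are assumptions made for this example.

```python
def classify_attachment(p_malicious: float, p_benign: float,
                        borderline_threshold: float) -> str:
    """Return "malicious", "benign", or "borderline" for one attachment.

    ASSUMPTION: the borderline test is applied first and uses <=;
    the claims recite only that the difference between the two
    probabilities is "within a predetermined threshold".
    """
    if abs(p_malicious - p_benign) <= borderline_threshold:
        return "borderline"
    # Otherwise the class with the greater probability is chosen.
    return "malicious" if p_malicious > p_benign else "benign"
```

With a hypothetical threshold of 0.05, probabilities of 0.9 (malicious) and 0.1 (benign) would yield “malicious,” while 0.52 and 0.48 would yield “borderline.”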
VI. CONSTRUCTION OF THE CHALLENGED CLAIMS (37 C.F.R. § 42.104(B)(3))
The terms in claims 1-20 are to be given their broadest reasonable construction
(“BRC”), as understood by one of ordinary skill in the art and consistent with the dis-
closure. See 37 C.F.R. § 42.100(b); see also In re Yamamoto, 740 F.2d 1569, 1571 (Fed.
Cir. 1984); In re Am. Acad. of Sci. Tech. Ctr., 367 F.3d 1359, 1363-64 (Fed. Cir. 2004).
The following constructions were adopted by the district court in The Trustees of
Columbia University in the City of New York v. Symantec Corp., Civil Action No. 3:13-cv-
808 for the ’907 patent. Petitioner submits that the claim terms should be construed
at least as broadly as the constructions the district court adopted for the reasons set
forth in that case. (Exs. 1005-1006).
Specifically, the BRC of the term “byte sequence feature” is a “[f]eature that is
a representation of machine code instructions of the executable” where a “‘[f]eature’ is
a property or attribute of data which may take on a set of values.” (Ex. 1005). This
construction is consistent with the specification, which states that a “byte sequence
feature is informative because it represents the machine code in an executable.” (’907
patent, 6:18-20, Ex. 1001; Goodrich Decl., ¶ 52, Ex. 1003). In addition, the specifica-
tion states that a feature is a “propert[y]” of the executable. (’907 patent, 3:36-40, Ex.
1001; Goodrich Decl., ¶ 52, Ex. 1003).
In addition to the construction adopted by the district court, Petitioner submits
the following constructions:
The BRC of the term “filtering” is “extracting.” (Goodrich Decl., ¶ 53, Ex.
1003). This construction is consistent with the specification, which states that “emails
are filtered to extract attachments or other components from the email.” (’907 patent,
13:6-7, Ex. 1001; Goodrich Decl., ¶ 53, Ex. 1003).
The BRC of the term “classification rule set” is “a set of hypotheses that pre-
dict classification.” (Goodrich Decl., ¶ 54, Ex. 1003). This construction is consistent
with the specification, which states that “a classification rule set is considered to have
the standard meaning in data mining terminology, i.e., a set of hypotheses that predict
the classification.” (’907 patent, 12:3-7, Ex. 1001; Goodrich Decl., ¶ 54, Ex. 1003).
The BRC of the term “static properties” is “properties that do not require an
executable to be run in order to be discerned.” (Goodrich Decl., ¶ 55, Ex. 1003).
This construction is consistent with the specification, which states “extracting the byte
sequence feature from said executable attachment comprises extracting static
properties of the executable attachment, which are properties that do not require the
executable to be run in order to discern.” (’907 patent, 3:36-40, Ex. 1001; Goodrich Decl., ¶
55, Ex. 1003).
VII. THE CHALLENGED CLAIMS ARE UNPATENTABLE
A. Classifying executable email attachments using byte sequence features is not new
The Background section of the ’907 patent describes how the propagation of ma-
licious executables through e-mail attachments is a serious security risk. (’907 patent,
1:46-53, Ex. 1001). Therefore, virus scanner technology uses signature-based detec-
tors and heuristic classifiers to detect new viruses. (’907 patent, 1:54-56, Ex. 1001).
Manually generating heuristic classifiers is costly and, therefore, “finding an automatic
method to generate classifiers has been the subject of research in the anti-virus com-
munity.” (’907 patent, 2:2-5, Ex. 1001). In fact, the research in the anti-virus com-
munity had already recognized that byte sequence features could be used to derive a
classification rule set for classifying an executable attachment. (See Cardinale, p. 26,
Ex. 1008). The elements recited by the challenged claims merely describe obvious
combinations in which the byte sequence features are used to classify executables.
1. U.S. Patent No. 5,832,208 (“Chen”)
Chen discloses an agent computer program that works in “conjunction with an-
ti-virus software to detect and remove computer viruses from email attachments.”
(Chen, 5:1-6, Ex. 1007). When an email is received by a mail server, the agent com-
puter program determines whether an attachment is present in the email message.
(Chen, 7:41-43, Ex. 1007). Chen discloses that an attachment may be an executable
program. (Chen, 3:21-22, Ex. 1007). If the email includes an attachment, the agent
computer program detaches the email attachment from the email message and sends
the attachment to an anti-virus application for virus scanning. (Chen, 7:48-51, Ex.
1007).
The anti-virus application scans the attachment for viruses. (Chen, 7:21-51, Ex.
1007). If the anti-virus application classifies the attachment as being infected with a
virus, the agent computer system transmits an alert to devices in a network and the
anti-virus application attempts to remove the virus from the attachment. (Chen, 7:56-
58 and 8:6-8, Ex. 1007). If the anti-virus application is able to remove the virus, the
agent computer program reattaches the attachment to the original email, and the email is
handled like a normal email. (Chen, 5:25-27 and 8:8-9, Ex. 1007). Chen discloses that
the agent computer program can work in conjunction with any virus detection pro-
gram. (Chen, 6:29-32, Ex. 1007).
2. “A Constructive Induction Approach to Computer Immunology” (“Cardinale”)
Cardinale discloses a developed prototype named MERCURY that includes a
virus scanner that detects viruses in executable files. (Cardinale, p. 114, sec. 3.5.3.1.2
and p. 172, sec. 5.5.2, Ex. 1008). MERCURY employs a learning method called “in-
duction learning” to generate a set of detectors2 that can distinguish between self and
nonself files. (Cardinale, pp. 30-31, sec. 1.5 and pp. 49-51, sec. 2.4, Ex. 1008). Cardi-
nale discloses that nonself files are files that are infected with a virus and that the self
files are files that are not infected. (Cardinale, pp. 20-22, sec. 1.1, Ex. 1008).
A component of MERCURY, referred to as “HEC,” creates the detectors
which are for both self and nonself files. (Cardinale, p. 140, sec. 5.3.1, Ex. 1008).
To create the detectors, the HEC uses training samples that include viral and nonviral
examples. (Cardinale, p. 49, sec. 2.4.1; p. 51, sec. 2.4.1.1, Ex. 1008). From the
training samples, the HEC creates hypotheses, which are candidate detectors. (Cardinale,
p. 139, sec. 5.2; p. 140, sec. 5.3, Ex. 1008). The HEC creates the hypotheses using
“two methods: initial selection of attributes from an example file, or construction
based upon the features of two existing hypotheses from the same concept.”
(Cardinale, p. 140, sec. 5.3.1, Ex. 1008). For selection of attributes from an example file, the
HEC uses the following three rules to select bytes from the example file: “chunking,
sliding window, and every other byte sliding window.” (Cardinale, p. 143, sec. 5.3.1.3,
Ex. 1008).
2 Cardinale also refers to the detectors as “byte patterns” and “signatures.” (Cardinale,
p. 30, sec. 1.4; p. 139, sec. 5.2; p. 140, sec. 5.3, p. 141, sec. 5.3.1.1, Ex. 1008).
For each hypothesis, a score is calculated that indicates how well the hypothesis
classifies examples. (Cardinale, p. 141, sec. 5.3.1.1; p. 156, sec. 5.3.2, Ex. 1008). The
score is used to determine if a detector should be derived from the hypothesis and in-
cluded in a knowledge base that includes detectors “used to classify executable files as
self and nonself.” (Cardinale, pp. 156-157, sec. 5.3.2; p. 170, sec. 5.3; and p. 171, sec.
5.4, Ex. 1008). If the score is acceptable, a detector is derived from the hypothesis
and added to the knowledge base. (Cardinale, p. 156, sec. 5.3.2; p. 170, sec. 5.3; and p.
171, sec. 5.4, Ex. 1008).
To classify an executable file, 16 bytes at a time are extracted from the executable
file. (Cardinale, p. 173, sec. 5.5.3, Ex. 1008). Each of the byte sequences is compared
to the detectors included in the knowledge base. (Id.). The file may be classified as
self, nonself, or indiscernible. (Cardinale, p. 127, sec. 4.6.2; pp. 173-175, secs.
5.5.3.1-5.5.3.3, Ex. 1008). An indiscernible file is sent to a virus expert to determine whether
the file is infected. (Cardinale, pp. 173-174, sec. 5.5.3.1, Ex. 1008). Once the results
are received from the expert, the HEC creates a new detector based on the result and
adds it to the knowledge base. (Cardinale, p. 115, sec. 3.5.3.1.3; p. 174, sec. 5.5.3.1,
Ex. 1008).
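For illustration only, the scanning step described above (extracting 16 bytes at a time and comparing each byte sequence to the detectors in the knowledge base) can be sketched in Python. The sketch is not taken from Cardinale: the function names, the stride-based reading of “sliding window” and “chunking,” the exact-match comparison, and the majority rule that produces “indiscernible” on a tie are all assumptions made for this example.

```python
from typing import Iterator, Set

WINDOW_SIZE = 16  # bytes extracted at a time, per the description above


def windows(data: bytes, size: int = WINDOW_SIZE, stride: int = 1) -> Iterator[bytes]:
    """Yield fixed-size byte sequences from data.

    ASSUMPTION: stride=1 models an overlapping sliding window and
    stride=size models non-overlapping blocks (one reading of "chunking").
    """
    for start in range(0, len(data) - size + 1, stride):
        yield data[start:start + size]


def classify_file(data: bytes, self_detectors: Set[bytes],
                  nonself_detectors: Set[bytes]) -> str:
    """Compare each extracted byte sequence to the two detector sets.

    ASSUMPTION: exact matching and the majority rule below are
    hypothetical stand-ins for the scoring Cardinale actually uses.
    """
    self_hits = 0
    nonself_hits = 0
    for seq in windows(data):
        if seq in self_detectors:
            self_hits += 1
        if seq in nonself_detectors:
            nonself_hits += 1
    if nonself_hits > self_hits:
        return "nonself"
    if self_hits > nonself_hits:
        return "self"
    # No matches, or an even split: defer to a human expert.
    return "indiscernible"
```

Under these assumptions, a file that matches no detector at all falls into the “indiscernible” class, which corresponds to the case that is referred to a virus expert.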
3. “Automatically Generated WIN32 Heuristic Virus Detection” (“Arnold”)
Arnold discloses a heuristic classifier for detecting computer viruses. In partic-
ular, Arnold discloses a neural network classifier made up of eight linear networks,
where each network classifies a file and the outputs from the networks are combined
to determine whether the file is infected with a virus. (Arnold, p. 6, Ex. 1009). Each
network is trained using n-grams identified using viral and clean training samples.
(Arnold, pp. 4-5, Ex. 1009). The n-grams are small sequences of bytes extracted from
files. (Arnold, p. 2, Ex. 1009).
Once the networks are trained, the classifier is tested using sample files. (Ar-
nold, p. 5, Ex. 1009). For a test sample file, an input vector is generated based on
which n-gram features the file includes. (Id.). Each network calculates an output O
for the input vector. (Id.). The output is a value between 0.0 and 1.0. (Id.). The out-
put value is then compared to a threshold. (Id.). If the output is above the threshold,
a discrete output of 1 is output by the network, indicating the file is infected. (Id.). If
the output is below the threshold, a discrete output of zero is output by the network,
indicating the file is not infected. (Id.). The discrete outputs of the networks are
summed and the sum is compared to a threshold V. (Arnold, p. 6, Ex. 1009). If the
sum is greater than V, the output of the classifier is 1, indicating that the file has been
classified as infected. (Id.). If the sum is less than V, the output of the classifier is 0,
indicating that the file has been classified as uninfected. (Id.). The output value O
calculated by each network and the sum of the discrete outputs each represents a
probability of whether the file is malicious. (Id.).
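For illustration only, the two-stage thresholding described above can be sketched in Python. The sketch assumes the per-network output values O have already been computed; the function and parameter names are hypothetical, and the treatment of an output or sum exactly equal to its threshold (counted here as “not infected”) is an assumption, since the description above addresses only values above or below the threshold.

```python
from typing import Sequence


def network_vote(output: float, threshold: float) -> int:
    """Convert one network's output O (0.0 to 1.0) into a discrete vote."""
    # ASSUMPTION: an output exactly equal to the threshold counts as 0.
    return 1 if output > threshold else 0


def classify_file(outputs: Sequence[float], threshold: float, v: int) -> int:
    """Sum the discrete votes and compare the sum to the threshold V.

    Returns 1 (classified as infected) or 0 (classified as uninfected).
    """
    votes = sum(network_vote(o, threshold) for o in outputs)
    return 1 if votes > v else 0
```

For example, with eight networks, a hypothetical per-network threshold of 0.5, and a hypothetical V of 4, the file is classified as infected only if at least five networks output a value above 0.5.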
4. “Attacks on WIN32” (“Szor”)
Szor discloses known attack methods used by viruses against the Win32 API
and the platforms that support it. (Szor, pp. 1-2, Ex. 1010). Szor discloses various
features which can be “useful to detect 32-bit Windows viruses heuristically.” (Szor,
p. 24, Ex. 1010). One such suspicious feature is found if the executable file references
certain resources found in the external KERNEL32.DLL file. (Szor, p. 26, Ex. 1010).
5. U.S. Patent No. 6,823,323 (“Forman”)
Forman discloses “classifying an instance (i.e., a data item or a record) automat-
ically into one or more classes [] from a set of potential classes.” (Forman, Abstract,
Ex. 1011). Forman discloses that “a system [] for classifying a new instance [] includes
a ballpark classifier [], which is generated … from a set of training records [] corre-
sponding to an entire set of potential classes into which [the] new instance [] may be
classified.” (Forman, 4:5-9, Ex. 1011). The ballpark classifier may be a Naive Bayes
classifier that assigns each of the potential classes a probability of the new instance be-
longing to the class. (Forman, 4:38-44, Ex. 1011).
6. “Boosting and Naïve Bayesian Learning” (“Elkan”)
Elkan discloses “boosting applied to naïve Bayesian classifiers.” (Elkan, p. 1,
Abstract, Ex. 1012). The boosting idea is to learn a series of Naïve Bayesian classifi-
ers, “where each classifier in the series pays more attention to the examples misclassi-
fied by its predecessor” (Elkan, p. 5, sec. 4, Ex. 1012). Once the Naïve Bayesian clas-
sifiers are learned and input attributes are provided to be classified, each individual
Naïve Bayesian classifier produces an output. (Id.). A combined output H is deter-
mined by “applying a sigmoid function to a weighted sum of the outputs of the indi-
vidual classifiers.” (Id.). In addition, Elkan discloses that the Naïve Bayesian classifi-
ers operate on parallel computing units. (Elkan, p. 1, Abstract and p. 6, sec. 6, Ex.
1012).
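The combination rule quoted above can be illustrated with a minimal sketch. The function names, the example outputs, and the weights are hypothetical and are not taken from Elkan; only the form of the rule, a sigmoid function applied to a weighted sum of the individual classifier outputs, comes from the quoted passage.

```python
import math

def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def combined_output(outputs, weights):
    """Apply a sigmoid to the weighted sum of individual classifier outputs.

    outputs -- one real-valued output per classifier (hypothetical values)
    weights -- one weight per classifier (hypothetical values)
    """
    if len(outputs) != len(weights):
        raise ValueError("need exactly one weight per classifier output")
    weighted_sum = sum(w * o for w, o in zip(weights, outputs))
    return sigmoid(weighted_sum)

# Three illustrative classifier outputs and weights (assumed, not from Elkan).
H = combined_output([1.2, -0.4, 0.9], [0.5, 0.3, 0.2])
```

Because the sigmoid maps every real number into the open interval (0, 1), the combined output H always lies between zero and one.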
7. Admitted Prior Art (“APA”)
The APA in the specification of the ’907 patent discloses that “[h]exdump, as is
known in the art … is an open source tool that transforms binary files into hexadeci-
mal files.” (’907 patent, 6:7-12, Ex. 1001).
B. Reasons the Claims are Unpatentable
1. Ground 1: Chen in View of Cardinale and further in view of Elkan Renders Obvious Claims 10, 11, and 15-17 Under 35 U.S.C. § 103(a)
Chen in view of Cardinale and further in view of Elkan teaches every element
of claims 10, 11, and 15-17. Independent claim 10 generally recites four elements: 1) a
“filtering” element in which an email filter filters an executable attachment from an
email; 2) an “extracting” element in which a feature extractor extracts a byte sequence
feature from the executable attachment; 3) a “classifying” element in which a rule
evaluator classifies the executable attachment by comparing the extracted byte se-
quence feature to a classification rule set; and 4) a “determining” element in which a
probability is determined that the executable attachment is a member of a class. The
determination of the probability is divided into multiple steps executed in parallel.
Each of these elements is plainly taught by the combination of Chen, Cardinale, and
Elkan. Dependent claims 11 and 15-17 recite additional features related to these elements,
which are also disclosed by the combination of Chen, Cardinale, and Elkan.
A person of ordinary skill in the art would find it obvious to modify Cardinale’s
virus detector to include the multiple Naïve Bayes classifiers disclosed by Elkan for
classifying executable files. Goodrich Decl., ¶ 98, Ex. 1003. Cardinale discloses that even though it
uses induction learning, other machine learning approaches could have been used for
classifying files and extracting signatures, such as neural networks and Bayesian meth-
ods. Cardinale, p. 22, sec. 1.1; pp. 83-85, sec. 3.4.1; and pp. 232-233, sec. 7.5.2, Ex.
1008. As is known to a person of ordinary skill in the art, a Naïve Bayes classifier is a
type of machine learning classifier and a Bayesian method. Goodrich Decl., ¶ 98, Ex.
1003. Therefore, modifying Cardinale’s virus detector to include Elkan’s multiple Na-
ïve Bayes classifiers is nothing more than a simple substitution of one known element
for another to obtain predictable results. Id.
A person of ordinary skill in the art would find it obvious to combine Cardi-
nale’s virus detector as modified by Elkan with the agent computer program of Chen.
Goodrich Decl., ¶ 99, Ex. 1003. Chen discloses that its agent computer program can
be used with any virus detector. Chen, 6:29-32, Ex. 1007. Therefore, using Chen’s
agent computer program with Cardinale’s virus detector is nothing more than a simple
substitution of one known element for another to obtain predictable results, as well as
a combination of prior art elements according to known methods to yield a predictable
result. Goodrich Decl., ¶ 99, Ex. 1003. Further, one of ordinary skill in the art
would have been motivated to combine the teachings of Chen, Cardinale, and Elkan
because they relate to the same field of art, classifiers. Id.
a. Claim 10: “A system for classifying an executable attachment in an email received at a computer system”
Chen discloses a “system for classifying an executable attachment in an email
received at a computer system.” Goodrich Decl., ¶ 86, Ex. 1003. Chen discloses an
agent computer program that works “in conjunction with anti-virus software to detect
and remove computer virus[es] that may be in e-mail attachments” of emails received
by a mail server. Chen, 5:3-5; 5:29-30; 6:54-61, Ex. 1007. Accordingly, Chen disclos-
es a system (agent and anti-virus software) for classifying an executable attachment in
an email received at a computer system (mail server). Goodrich Decl., ¶ 86, Ex. 1003.
b. Claim 10: “one or more computer processors executing instructions”
Chen discloses that the agent computer program runs on a server. Chen, 6:51-
53, Ex. 1007. A person of ordinary skill in the art would recognize that a server in-
cludes one or more computer processors executing instructions. Goodrich Decl., ¶
86, Ex. 1003. Therefore, Chen discloses “one or more computer processors execut-
ing instructions.” Id.
c. Claim 10: “a) an email filter configured to filter said executable attachment from said email”
Chen discloses “an email filter configured to filter said executable attachment
from said email.” Goodrich Decl., ¶ 87, Ex. 1003. The BRC of the term “filter” is
“extract.” Goodrich Decl., ¶ 53, Ex. 1003.
Chen discloses that the agent computer program (which may also be referred to
as the “agent”) “determines whether an attachment is present in an e-mail message.”
Chen, 7:41-43, Ex. 1007. Chen describes that an email attachment may be an execut-
able attachment. Chen, 3:21-22, Ex. 1007. “If an attachment is present in an e-mail
message, the agent detaches the attachment … and [] sends the attachment to [the]
anti-virus application” so that it can be scanned for viruses. Chen, 7:48-51, Ex. 1007.
“Detaching” as used here means “extracting.” Goodrich Decl., ¶ 87, Ex. 1003.
Therefore, Chen discloses the agent (email filter) extracting an executable at-
tachment from an email, which one of ordinary skill in the art would recognize corre-
sponds to filtering the executable attachment from the email. Goodrich Decl., ¶ 88,
Ex. 1003. Accordingly, Chen discloses “an email filter configured to filter said exe-
cutable attachment from said email.” Id.
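The detaching step described above can be sketched with the Python standard library. This is an illustrative sketch only and is not Chen's agent: the function name `detach_attachments`, the sample message, and the file name `sample.exe` are hypothetical.

```python
from email import message_from_bytes
from email.message import EmailMessage

def detach_attachments(raw_message):
    """Return (filename, payload_bytes) for every attachment in an email."""
    message = message_from_bytes(raw_message)
    attachments = []
    for part in message.walk():
        # Only parts marked "Content-Disposition: attachment" are detached.
        if part.get_content_disposition() == "attachment":
            attachments.append(
                (part.get_filename(), part.get_payload(decode=True))
            )
    return attachments

# Build a small test message with one binary attachment (hypothetical data).
msg = EmailMessage()
msg["Subject"] = "example"
msg.set_content("See attached.")
msg.add_attachment(b"MZ\x90\x00", maintype="application",
                   subtype="octet-stream", filename="sample.exe")
found = detach_attachments(msg.as_bytes())
```

The detached bytes in `found` could then be handed to a separate scanning component, mirroring the agent sending the attachment to the anti-virus application.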
d. Claim 10: “b) a feature extractor configured to extract a byte sequence feature from said executable attachment”
Chen in view of Cardinale discloses “a feature extractor configured to extract a
byte sequence feature from said executable attachment.” Goodrich Decl., ¶ 88, Ex.
1003. The BRC of the term “byte sequence feature” is a “feature that is a representa-
tion of machine code instructions of the executable, where a ‘feature’ is a property or
attribute of data which may take on a set of values.” Goodrich Decl., ¶ 52, Ex. 1003.
Cardinale discloses a virus scanner “developed to evaluate the byte patterns in-
side files,” specifically executable files. Cardinale, p. 114, sec. 3.5.3.1.2; p. 172, sec.
5.5.2, Ex. 1008. To classify an executable file, the virus scanner (feature extractor) ex-
tracts 16 byte sequences from the entire file, one at a time. Cardinale, p. 173, sec.
5.5.3, Ex. 1008. As is known by a person of ordinary skill in the art, an executable file
includes machine code instructions. Goodrich Decl., ¶ 88, Ex. 1003. Since 16 byte
sequences are read from the entire executable file, it is obvious that one or more of
the 16 byte sequences will be a feature that is a representation of machine code in-
structions of the executable. Id. Thus, Cardinale discloses “a feature extractor con-
figured to extract a byte sequence feature from said executable attachment.” Id.
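Reading 16 byte sequences across an entire file can be sketched as follows. This is a minimal sketch under stated assumptions: the function name and the one-byte step between consecutive sequences are hypothetical choices, since the passages cited above do not specify how far the scanner advances between reads.

```python
def byte_sequences(data, length=16, step=1):
    """Yield every `length`-byte sequence in `data`, advancing `step` bytes.

    The one-byte default step is an assumption made for illustration.
    """
    for offset in range(0, len(data) - length + 1, step):
        yield data[offset:offset + length]

sample = bytes(range(20))          # stand-in for an executable's contents
windows = list(byte_sequences(sample))
```

No part of the file is executed here; the sequences are read directly from the stored bytes.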
e. Claim 10: “c) a rule evaluator configured to: classify said executable attachment by comparing said byte sequence feature of said executable attachment to a classification rule set derived from byte sequence features of a set of executables having a predetermined class in a set of classes”
Chen in view of Cardinale discloses “a rule evaluator configured to: classify said
executable attachment by comparing said byte sequence feature of said executable at-
tachment to a classification rule set derived from byte sequence features of a set of
executables having a predetermined class in a set of classes.” Goodrich Decl., ¶¶ 89-
90, Ex. 1003.
First, Cardinale discloses “a classification rule set derived from byte sequence
features of a set of executables having a predetermined class in a set of classes.”
Goodrich Decl., ¶ 89, Ex. 1003. The BRC of the term “classification rule set” is “a
set of hypotheses that predict classification.” Goodrich Decl., ¶ 54, Ex. 1003.
Cardinale discloses a component referred to as “HEC” which creates a set of
detectors used to classify both nonself (viral) and self (nonviral) files. Cardinale, p. 30,
sec. 1.4; p. 140, sec. 5.3; and p. 171, sec. 5.4, Ex. 1008. Cardinale also refers to the de-
tectors as byte patterns and signatures. Cardinale, p. 30, sec. 1.4; p. 140, sec. 5.3; p.
170, sec. 5.3.4; and p. 171, sec. 5.4, Ex. 1008.
Cardinale explains that the set of detectors are created by the HEC using train-
ing samples that include nonself and self examples. Cardinale, p. 49, sec. 2.4.1; p. 51,
sec. 2.4.1.1, Ex. 1008. From the training samples, the HEC creates hypotheses, which
are candidate detectors. Cardinale, p. 139, sec. 5.2; p. 140, sec. 5.3.1, Ex. 1008. The
HEC creates the hypotheses by using “three selection rules” to select bytes from the
training samples: “chunking, sliding window, and every other byte sliding window.”
Cardinale, p. 143, sec. 5.3.1.3, Ex. 1008. A score is calculated for each hypothesis that
indicates how well the hypothesis classifies examples. Cardinale, p. 141, sec. 5.3.1.1;
p. 156, sec. 5.3.2, Ex. 1008. If the score of a hypothesis is acceptable, a detector is de-
rived from the hypothesis and added to a knowledge base. Cardinale, p. 170, sec.
5.3.4, Ex. 1008. The knowledge base includes hypotheses “used to classify files as self
or nonself.” Cardinale, p. 171, sec. 5.4, Ex. 1008.
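One possible reading of the three selection rules named above is sketched below. The sketch is an interpretation of the rule names only: the function names, parameters, and exact byte-selection behavior are assumptions and are not drawn from Cardinale's implementation.

```python
def chunking(data, n):
    """Non-overlapping n-byte blocks (a trailing partial block is dropped)."""
    return [data[i:i + n] for i in range(0, len(data) - n + 1, n)]

def sliding_window(data, n):
    """Overlapping n-byte windows, advancing one byte at a time."""
    return [data[i:i + n] for i in range(len(data) - n + 1)]

def every_other_byte_sliding_window(data, n):
    """n bytes taken from alternating positions, starting at each offset."""
    span = 2 * n - 1               # bytes covered by n alternating positions
    return [data[i:i + span:2] for i in range(len(data) - span + 1)]

training_sample = b"ABCDEFGH"      # hypothetical training bytes
candidates = {
    "chunking": chunking(training_sample, 2),
    "sliding": sliding_window(training_sample, 2),
    "every_other": every_other_byte_sliding_window(training_sample, 2),
}
```

Under this reading, each rule yields a different pool of candidate byte patterns from the same training sample, and each candidate would then be scored before being kept as a detector.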
Thus, Cardinale’s set of detectors in the knowledge base (hypotheses used to
predict classification) corresponds to the claimed “classification rule set derived from
byte sequence features of a set of executables having a predetermined class in a set of
classes.” Goodrich Decl., ¶ 89, Ex. 1003. Accordingly, Cardinale discloses the “clas-
sification rule set” element of claim 10. Id.
Further, Cardinale discloses that “a rule evaluator is configured to: classify said
executable attachment by comparing said byte sequence feature of said executable at-
tachment to a classification rule set.” Goodrich Decl., ¶ 90, Ex. 1003. Cardinale dis-
closes that the virus “[s]canner is responsible for determining the classification of a
file based upon the self and nonself detectors created by HEC.” Cardinale, p. 172,
sec. 5.5, Ex. 1008. Cardinale explains that to determine the classification of the exe-
cutable file, 16 byte sequences extracted from the file are compared to the detectors
included in the knowledge base. Cardinale, p. 127, sec. 4.6.2; p. 173, sec. 5.5.3, Ex.
1008. Thus, Cardinale’s virus scanner determining the classification of an executable
file by comparing the extracted 16 byte sequences to the knowledge base detectors
corresponds to “a rule evaluator [] configured to: classify said executable attachment
by comparing said byte sequence feature of said executable attachment to a classifica-
tion rule set.” Goodrich Decl., ¶ 90, Ex. 1003.
Based on the above, Cardinale discloses “a rule evaluator configured to: classify
said executable attachment by comparing said byte sequence feature of said executable
attachment to a classification rule set derived from byte sequence features of a set of
executables having a predetermined class in a set of classes.” Goodrich Decl., ¶¶ 89-
90, Ex. 1003.
f. Claim 10: “determine a probability that said executable attachment is a member of a class of said set of classes based on said byte sequence feature, and divide the determination of said probability into a plurality of processing steps and to execute said processing steps in parallel”
Chen in view of Cardinale, and further in view of Elkan discloses “deter-
min[ing] a probability that said executable attachment is a member of a class of said
set of classes based on said byte sequence feature, and divide the determination of
said probability into a plurality of processing steps and to execute said processing
steps in parallel.” Goodrich Decl., ¶¶ 91-92, Ex. 1003.
First, Chen, Cardinale, and Elkan in combination disclose “determin[ing] a
probability that said executable attachment is a member of a class of said set of classes
based on said byte sequence feature.” Goodrich Decl., ¶ 91, Ex. 1003. Elkan disclos-
es multiple Naïve Bayes classifiers learned in series, “where each classifier in the series
pays more attention to the examples misclassified by its predecessor.” Elkan, p. 5,
sec. 4, Ex. 1012. Once the Naïve Bayes classifiers are learned and input attributes are
provided for classification, each individual Naïve Bayes classifier produces an output
for the input. Id. A combined output H is determined by “applying a sigmoid func-
tion to a weighted sum of the outputs of the individual classifiers.” Id.
A person of ordinary skill in the art would recognize that the combined output
determined using the sigmoid function is a determined probability that the input at-
tributes are a member of a class of a set of classes. Goodrich Decl., ¶ 91, Ex. 1003. A
person of ordinary skill in the art would further recognize that, in view of Chen and
Cardinale, the combined output determined by Elkan is for an executable attachment and
determined based on byte sequence features extracted from the attachment. Id. Thus,
the combination of Chen, Cardinale, and Elkan discloses “determin[ing] a probability
that said executable attachment is a member of a class of said set of classes based on
said byte sequence feature.” Id.
Additionally, Chen, Cardinale, and Elkan in combination disclose “divid[ing]
the determination of said probability into a plurality of processing steps and [] ex-
ecut[ing] said processing steps in parallel.” Goodrich Decl., ¶ 92, Ex. 1003. As de-
scribed above, Elkan’s combined output is determined based on the outputs of each
of the individual Naïve Bayes classifiers. Elkan, p. 5, sec. 4, Ex. 1012. Since each of
the individual Naïve Bayes classifiers is operating (processing steps) to provide a con-
tribution to the combined output, a person of ordinary skill in the art would recognize
that the determination of the probability (combined output) is divided into a plurality
of processing steps. Goodrich Decl., ¶ 92, Ex. 1003.
Further, Elkan discloses that the multiple Naïve Bayes classifiers operate on
parallel computing units. Elkan, p. 1, Abstract; p. 6, sec. 6, Ex. 1012. Since Elkan’s
classifiers are processing steps to determine the probability and the classifiers are op-
erating on parallel computing units, a person of ordinary skill in the art would recog-
nize that Elkan is dividing the determination of the probability into a plurality of pro-
cessing steps and executing the steps in parallel. Goodrich Decl., ¶ 92, Ex. 1003.
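The division of the probability determination into separately executed steps can be sketched as follows. This is an illustrative sketch only: the stand-in classifiers, weights, and feature values are hypothetical, and a thread pool is used merely to show the structure of dispatching each classifier as its own task. It does not represent the parallel computing units described in Elkan.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def make_classifier(bias):
    """Return a stand-in classifier; a real one would be a learned model."""
    def classifier(features):
        return bias + sum(features)
    return classifier

def parallel_combined_output(classifiers, weights, features):
    """Evaluate each classifier as a separate task, then combine the results."""
    with ThreadPoolExecutor(max_workers=len(classifiers)) as pool:
        # One processing step per classifier, submitted concurrently.
        outputs = list(pool.map(lambda clf: clf(features), classifiers))
    weighted_sum = sum(w * o for w, o in zip(weights, outputs))
    return 1.0 / (1.0 + math.exp(-weighted_sum))

classifiers = [make_classifier(b) for b in (-1.0, 0.0, 1.0)]
probability = parallel_combined_output(classifiers, [0.2, 0.3, 0.5], [0.1, 0.2])
```

Each classifier contributes one independent step, and only the final weighted sum and sigmoid require all of the individual outputs.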
Based on the above, the combination of Chen, Cardinale, and Elkan discloses
“determin[ing] a probability that said executable attachment is a member of a class of
said set of classes based on said byte sequence feature, and divide the determination
of said probability into a plurality of processing steps and [] execut[ing] said processing
steps in parallel.” Goodrich Decl., ¶¶ 91-92, Ex. 1003.
g. Claim 11: “extract static properties of said executable attachment”
Chen in view of Cardinale, and further in view of Elkan discloses “extract[ing]
static properties of said executable attachment.” Goodrich Decl., ¶ 93, Ex. 1003. The
BRC of the phrase “static properties” is “properties that do not require an executable
to be run in order to be discerned.” Goodrich Decl., ¶ 55, Ex. 1003.
Cardinale discloses “extracting static properties of said executable attachment.”
As shown in Section VII(B)(1)(d), Cardinale discloses “extracting a byte sequence fea-
ture from said executable attachment” by extracting 16 byte sequences from the exe-
cutable file to be classified. Cardinale extracts the 16 byte sequences directly from the
contents of the executable file. Cardinale p. 173, sec. 5.5.3, Ex. 1008. A person of
ordinary skill in the art would recognize that the file does not have to be run to extract
the byte sequences. Goodrich Decl., ¶ 93, Ex. 1003. Therefore, the byte sequences
extracted by Cardinale are static properties. Id. Accordingly, Cardinale discloses the
elements of claim 11. Id.
h. Claim 15: “predict the classification of said executable attachment as one class of a set of classes consisting of malicious, benign, and borderline”
Chen in view of Cardinale, and further in view of Elkan discloses “predict[ing]
the classification of said executable attachment as one class of a set of classes consisting
of malicious, benign, and borderline.” Goodrich Decl., ¶¶ 94-95, Ex. 1003.
As shown in Section VII(B)(1)(e), Cardinale predicts the classification of an executable
file by comparing 16 byte sequences extracted from the executable file to the detec-
tors included in the knowledge base. Cardinale discloses classifying the file “as self if
one or more self detectors are found in the file and no nonself detectors are found.”
Cardinale, p. 174, sec. 5.5.3.2, Ex. 1008. A person of ordinary skill in the art would
recognize that a self classification corresponds to a benign classification. Goodrich
Decl., ¶ 94, Ex. 1003. The “file is classified as nonself if any nonself detector is
found.” Cardinale, p. 174, sec. 5.5.3.3, Ex. 1008. A person of ordinary skill in the art
would recognize that a nonself classification corresponds to a malicious classification.
Goodrich Decl., ¶ 94, Ex. 1003.
Further, if no hypotheses are found in the file, the file is flagged as indiscerni-
ble, considered unclassified (cannot be classified as either self or nonself), and sent to
a virus expert to determine if the file is infected. Cardinale, p. 127, sec. 4.6.2; pp. 173-174,
sec. 5.5.3.1, Ex. 1008. A person of ordinary skill in the art would recognize that
flagging the file as indiscernible and sending it to the expert signifies that it is unclear
whether the file is self or nonself (i.e., the file is borderline). Goodrich Decl., ¶ 95, Ex.
1003. Therefore, a person of ordinary skill in the art would recognize that flagging
the file as indiscernible corresponds to a “borderline” classification. Id. Accordingly,
Cardinale discloses the virus scanner configured to predict the classification of an exe-
cutable attachment as one class of a set of classes consisting of malicious (nonself),
benign (self), and borderline (indiscernible). Id.
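The three outcomes described above can be summarized in a short sketch. The decision rules follow the quoted passages (nonself if any nonself detector is found, self if one or more self detectors and no nonself detectors are found, and indiscernible otherwise). The function name, the detector contents, and the 4-byte length used in the usage lines are hypothetical and chosen only to keep the example small.

```python
def classify(file_bytes, self_detectors, nonself_detectors, length=16):
    """Classify a file as "self", "nonself", or "indiscernible"."""
    # Collect every `length`-byte sequence that occurs in the file.
    sequences = {
        file_bytes[i:i + length]
        for i in range(len(file_bytes) - length + 1)
    }
    if sequences & nonself_detectors:
        return "nonself"           # any nonself detector found
    if sequences & self_detectors:
        return "self"              # self detector(s) found, no nonself
    return "indiscernible"         # no detector of either kind found

# Hypothetical 4-byte detectors, shortened from 16 bytes for readability.
self_set = {b"GOOD"}
nonself_set = {b"EVIL"}
result_self = classify(b"xxGOODxx", self_set, nonself_set, length=4)
result_nonself = classify(b"GOODEVIL", self_set, nonself_set, length=4)
result_unknown = classify(b"nothing here", self_set, nonself_set, length=4)
```

The third outcome is the one that would be routed to a virus expert for review.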
i. Claim 16: “determine said probability that said executable attachment is a member of one class of said set of classes with a Naive Bayes algorithm”
Chen in view of Cardinale, and further in view of Elkan discloses “deter-
min[ing] said probability that said executable attachment is a member of one class of
said set of classes with a Naive Bayes algorithm.” Goodrich Decl., ¶ 96, Ex. 1003.
As shown in Section VII(B)(1)(f), the combination of Chen, Cardinale, and
Elkan discloses determining a probability that an executable attachment is a member
of a class of a set of classes. Elkan discloses determining the probability (combined
output) based on the outputs of multiple classifiers. Elkan, p. 5, sec. 4, Ex. 1012.
Elkan discloses that each of the classifiers is a Naïve Bayes classifier. Id. Therefore,
Elkan discloses determining the probability with a Naïve Bayes algorithm. Goodrich
Decl., ¶ 96, Ex. 1003.
j. Claim 17: “determine said probability that said executable attachment is a member of a class of said set of classes with a multi-Naive Bayes algorithm”
Chen in view of Cardinale, and further in view of Elkan discloses “deter-
min[ing] said probability that said executable attachment is a member of a class of said
set of classes with a multi-Naive Bayes algorithm.” Goodrich Decl., ¶ 97, Ex. 1003.
As shown in Section VII(B)(1)(f), the combination of Chen, Cardinale, and
Elkan discloses determining a probability that an executable attachment is a member
of a class of a set of classes. Elkan discloses determining the probability (combined
output) by “applying a sigmoid function to a weighted sum of the outputs of the indi-
vidual classifiers.” Elkan, p. 5, sec. 4, Ex. 1012. Since multiple Naïve Bayes classifiers
are used to classify and each individual classifier contributes to the determined proba-
bility, a person of ordinary skill in the art would recognize that the algorithm applied
by Elkan to determine the probability is a multi-Naïve Bayes algorithm. Goodrich
Decl., ¶ 97, Ex. 1003. Therefore, the combination of Chen, Cardinale, and Elkan dis-
closes the elements of claim 17. Id.
2. Ground 2: Chen in View of Cardinale and further in view of Elkan and APA Renders Obvious Claim 12 Under 35 U.S.C. § 103(a)
Chen in view of Cardinale and further in view of Elkan and APA teaches every
element of dependent claim 12. Claim 12 generally recites converting the executable
attachment from binary format to hexadecimal format. These elements are plainly
taught by the combination of Chen, Cardinale, Elkan, and APA.
A person of ordinary skill in the art would find it obvious to combine APA’s
teachings with the teachings of Cardinale because it is nothing more than a combina-
tion of prior art elements according to known methods to yield a predictable result.
Goodrich Decl., ¶ 102, Ex. 1003. Further, it is obvious to combine Cardinale in view
of Elkan and APA with Chen for the reasons provided in Section VII(B)(1).
a. Claim 12: “convert said executable attachment from binary format to hexadecimal format”
Chen in view of Cardinale, and further in view of Elkan and APA discloses
“convert[ing] the executable attachment from binary format to hexadecimal format.”
Goodrich Decl., ¶ 101, Ex. 1003. APA discloses that “[h]exdump, as is known in the
art … is an open source tool that transforms binary files into hexadecimal files.”
’907 patent, 6:12-18, Ex. 1001. Therefore, APA discloses “converting [an] executable
program from binary format to hexadecimal format” since APA acknowledges the ex-
istence of a prior art tool for this express purpose. Goodrich Decl., ¶ 101, Ex. 1003.
Accordingly, Chen, Cardinale, Elkan, and APA in combination disclose the elements
of claim 12. Id. A person of ordinary skill in the art would recognize that it is obvi-
ous for Cardinale’s virus detector to use hexdump to convert an executable file from
binary format to hexadecimal format when extracting byte sequences from the file be-
cause hexadecimal is one of two practical and commonly used ways of representing
binary data. Id.
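The binary-to-hexadecimal conversion discussed above can be illustrated with a short sketch. It is not the hexdump tool itself and does not reproduce that tool's output format; the function name and the 16-bytes-per-line layout are assumptions made for illustration.

```python
def hex_lines(data, width=16):
    """Return the bytes of `data` as lines of two-digit hexadecimal values."""
    return [
        " ".join(f"{b:02x}" for b in data[i:i + width])
        for i in range(0, len(data), width)
    ]

# Hypothetical leading bytes of a file, rendered in hexadecimal.
lines = hex_lines(b"MZ\x90\x00\x03")
```

Each byte of the binary input becomes one two-character hexadecimal token, so byte sequences can be read directly from the converted text.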
3. Ground 3: Chen in View of Cardinale and further in view of Elkan and Szor Renders Obvious Claim 13 Under 35 U.S.C. § 103(a)
Chen in view of Cardinale and further in view of Elkan and Szor teaches every
element of dependent claim 13. Claim 13 generally recites creating a byte string repre-
sentative of resources referenced by an executable attachment. These elements are
plainly taught by the combination of Chen, Cardinale, Elkan, and Szor.
A person of ordinary skill in the art would find it obvious to combine Cardi-
nale’s teachings with Szor’s teachings for extracting byte sequence features. Goodrich
Decl., ¶ 106, Ex. 1003. Cardinale describes that future iterations of its virus scanner
should employ heuristics from current antivirus programs to supplement its ability to
detect previously unseen invaders. Cardinale, p. 22, sec. 1.1, Ex. 1008. Therefore,
combining Cardinale’s teachings with Szor’s teachings is nothing more than combin-
ing prior art elements according to known methods to yield a predictable result.
Goodrich Decl., ¶ 106, Ex. 1003. Further, it is obvious to combine Cardinale in view
of Elkan and Szor with Chen for the reasons provided in Section VII(B)(1). Id. One
of ordinary skill in the art would have been motivated to combine the teachings of
Chen, Cardinale, Elkan, and Szor because all four references relate to the same field of
art, classifiers. Id.
a. Claim 13: “create a byte string representative of resources referenced by said executable attachment”
Chen in view of Cardinale, and further in view of Elkan and Szor discloses
“creat[ing] a byte string representative of resources referenced by said executable at-
tachment.” Goodrich Decl., ¶¶ 104-105, Ex. 1003. For the same reasons provided in
Section VII(B)(1)(d), Cardinale discloses “extracting a byte sequence feature from said
executable attachment” by extracting 16 byte sequences from the executable file to be
classified.
Szor shows that Cardinale also discloses “creat[ing] a byte string representative
of resources referenced by said executable attachment” when extracting the 16 byte
sequences. Szor describes various features which can be “useful to detect 32-bit Win-
dows viruses heuristically” and cause a heuristic flag to be set. Szor, p. 24, sec. 5, Ex.
1010. One feature that is suspicious is if the functions GetProcAddress or GetMod-
uleHandleA are imported by a file from KERNEL32.DLL. Szor, p. 26, sec. 5.1.8.,
Ex. 1010. Another feature that is suspicious is if the functions GetProcAddress and
GetModuleHandleA are both imported by the file from KERNEL32.DLL at the
same time. Szor, p. 26, sec. 5.1.9, Ex. 1010.
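The two import-based features described above can be sketched as follows. This is a minimal sketch, not Szor's scanner: it assumes the import table has already been parsed into a mapping from DLL name to imported function names, and the function name, flag names, and sample import tables are hypothetical.

```python
SUSPICIOUS = {"GetProcAddress", "GetModuleHandleA"}

def import_flags(imports):
    """Return heuristic flags for KERNEL32.DLL imports.

    imports -- {dll_name: set of imported function names}, assumed to be
               produced by a separate import-table parser (not shown).
    """
    kernel32 = set()
    for dll, names in imports.items():
        if dll.upper() == "KERNEL32.DLL":
            kernel32 |= set(names)
    hits = SUSPICIOUS & kernel32
    return {
        "imports_either": bool(hits),        # at least one of the two
        "imports_both": hits == SUSPICIOUS,  # both at the same time
    }

# Hypothetical import tables for two files.
flags_a = import_flags({"KERNEL32.DLL": {"GetProcAddress", "CreateFileA"}})
flags_b = import_flags({"kernel32.dll": {"GetProcAddress", "GetModuleHandleA"}})
```

Both flags depend only on which resources the file references, not on running the file.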
Thus, Szor shows that files (e.g., malicious executable attachments) will refer-
ence resources, such as DLLs and DLL functions. Goodrich Decl., ¶ 104, Ex. 1003.
As described above, Cardinale extracts 16 byte sequences from an entire executable
file to be classified. Cardinale, p. 173, sec. 5.5.3, Ex. 1008. A person of ordinary skill
in the art would recognize that in view of Szor, if the file is infected with a virus that
imports DLL functions, one or more byte sequences extracted by Cardinale for the
file will be a created byte string representative of resources referenced by the executa-
ble file. Goodrich Decl., ¶ 105, Ex. 1003. Accordingly, Cardinale in view of Szor dis-
closes the elements of claim 13. Id.
4. Ground 4: Chen in View of Cardinale and further in view of Elkan and Arnold Renders Obvious Claim 14 Under 35 U.S.C. § 103(a)
Chen in view of Cardinale and further in view of Elkan and Arnold teaches
every element of dependent claim 14. Claim 14 generally recites predicting the classi-
fication of an executable attachment as one class of a set of classes consisting of mali-
cious and benign. These elements are plainly taught by the combination of Chen,
Cardinale, Elkan, and Arnold.
A person of ordinary skill in the art would find it obvious to combine Arnold’s
teachings with Cardinale’s teaching for classifying files as either malicious or benign
because it is nothing more than a simple substitution of one known element for an-
other to obtain predictable results. Goodrich Decl., ¶ 110, Ex. 1003. Further, it is
obvious to combine Cardinale in view of Elkan and Arnold with Chen for the reasons
provided in Section VII(B)(1). Id. One of ordinary skill in the art would have been
motivated to combine the teachings of Chen, Cardinale, Elkan, and Arnold because all
four references relate to the same field of art, classifiers. Id.
a. Claim 14: “predict the classification of said executable attachment as one class of a set of classes consisting of malicious and benign”
Chen in view of Cardinale, and further in view of Elkan and Arnold discloses
“predict[ing] the classification of said executable attachment as one class of a set of
classes consisting of malicious and benign.” Goodrich Decl., ¶¶ 108-109, Ex. 1003.
As described in Section VII(B)(1)(e), Cardinale predicts the classification of an exe-
cutable file by comparing 16 byte sequences extracted from the executable file to the
detectors included in the knowledge base.
Arnold discloses that a file is classified as one class of a set of classes consisting
of malicious and benign. Goodrich Decl., ¶ 109, Ex. 1003. Arnold describes a neural
network classifier made up of eight linear networks. Arnold, p. 6, Ex. 1009. For a file
to be classified by the classifier, an input vector is generated based on which n-gram
features the file includes. Arnold, p. 5, Ex. 1009. Each network calculates an output
value O for the input vector, with a value between 0.0 and 1.0. Id.
If the output value O is above the threshold, a discrete output of 1 is output by
the network, indicating the file is infected. Id. If the output value O is below the
threshold, a discrete output of zero is output by the network, indicating the file is not
infected. Id. The discrete outputs of the networks are summed to determine an over-
all output for the classifier. Arnold, p. 6, Ex. 1009. To determine the overall output,
the sum is compared to a threshold V. Id. If the sum is greater than V, the output of
the classifier is 1, indicating that the file has been classified as infected. Id. If the sum
is less than V, the output of the classifier is 0, indicating that the file has been classi-
fied as uninfected. Id.
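The two-stage thresholding described above can be sketched as follows. The structure (per-network discrete outputs, summed and compared to a threshold V) follows the description; the function name, the sample output values, the per-network threshold of 0.5, and the value V = 4 are hypothetical.

```python
def classify_by_vote(network_outputs, threshold, V):
    """Return 1 (infected) or 0 (uninfected) from per-network outputs.

    network_outputs -- one output value O in [0.0, 1.0] per network
    threshold       -- per-network cutoff for a discrete output of 1
    V               -- cutoff applied to the sum of the discrete outputs
    """
    discrete = [1 if o > threshold else 0 for o in network_outputs]
    return 1 if sum(discrete) > V else 0

# Eight hypothetical network outputs for each of two files.
infected = classify_by_vote(
    [0.9, 0.8, 0.7, 0.95, 0.6, 0.2, 0.85, 0.1], threshold=0.5, V=4)
uninfected = classify_by_vote(
    [0.1, 0.2, 0.7, 0.3, 0.6, 0.2, 0.05, 0.1], threshold=0.5, V=4)
```

The overall classifier output is therefore binary, with 1 corresponding to infected and 0 to uninfected.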
Therefore, Arnold is predicting the classification of a file as one class of a set of
classes consisting of malicious (infected) and benign (uninfected). Goodrich Decl., ¶
109, Ex. 1003. Accordingly, the combination of Chen, Cardinale, Elkan, and Arnold
discloses the elements of claim 14.
5. Ground 5: Chen in View of Cardinale and further in view of Elkan and Forman Renders Obvious Claims 1, 2, 6-9, and 18-20 Under 35 U.S.C. § 103(a)
Chen in view of Cardinale and further in view of Elkan and Forman teaches
every element of claims 1, 2, 6-9, and 18-20. Independent claim 1 is similar to independent
claim 10 discussed in Section VII(B)(1), with a few exceptions. Independent
claim 1 recites four elements: 1) a “filtering” element similar to the “filtering” element
of claim 10; 2) an “extracting” element similar to the “extracting” element of claim 10;
3) a “classifying” element similar to the “classifying” element of claim 10; and 4) a
“determining” element in which a probability is determined using a multi-Naïve Bayes
algorithm that the executable attachment is a member of each class in a set of classes
based on a byte sequence feature. The determining of the probability is divided into a
plurality of processing steps executed in parallel. Thus, the primary difference from
claim 10 is that the probability is determined using a multi-Naïve Bayes algorithm for
each class in a set of classes. Each of these elements is plainly taught by the combina-
tion of Chen, Cardinale, Elkan, and Forman. Dependent claims 2, 6-9, and 18-20 re-
cite additional features related to these elements which are also disclosed by the com-
bination of Chen, Cardinale, Elkan, and Forman.
A person of ordinary skill in the art would find it obvious to combine the
Naïve Bayes classifier described by Forman with Elkan’s multi-Naïve Bayes algorithm
because it is nothing more than a simple substitution of one known element for
another to obtain predictable results. Goodrich Decl., ¶ 118, Ex. 1003. Additionally, a
person of ordinary skill in the art would find it obvious to combine Elkan in view of
Forman with Cardinale for the reasons provided in Section VII(B)(1). Id. Further, a
person of ordinary skill in the art would find it obvious to combine Cardinale in view
of Elkan and Forman with Chen for the reasons provided in Section VII(B)(1). Id.
One of ordinary skill in the art would have been motivated to combine the teachings
of Chen, Cardinale, Elkan, and Forman because all four references relate to the same
field of art, classifiers. Id.
a. Claim 1: “A method for classifying an executable attachment in an email received at a computer system”
For the same reasons provided in Section VII(B)(1)(a), Chen discloses a
“method for classifying an executable attachment in an email received at a computer
system.”
b. Claim 1: “a) filtering said executable attachment from said email”
For the same reasons provided in Section VII(B)(1)(c), Chen discloses “filtering
said executable attachment from said email.”
c. Claim 1: “b) extracting a byte sequence feature from said executable attachment”
For the same reasons provided in Section VII(B)(1)(d), Cardinale discloses
“extracting a byte sequence feature from said executable attachment.”
d. Claim 1: “c) classifying said executable attachment by comparing said byte sequence feature of said executable attachment with a classification rule set derived from byte sequence features of a set of executables having a predetermined class in a set of classes”
For the same reasons provided in Section VII(B)(1)(e), Cardinale discloses
“classifying said executable attachment by comparing said byte sequence feature of
said executable attachment with a classification rule set derived from byte sequence
features of a set of executables having a predetermined class in a set of classes.”
e. Claim 1: “wherein said classifying comprises determining using a computer processor, with a Multi-Naive Bayes algorithm, a probability that said executable attachment is a member of each class in said set of classes based on said byte sequence feature and dividing said step of determining said probability into a plurality of processing steps and executing said processing steps in parallel”
Chen in view of Cardinale, and further in view of Elkan and Forman discloses
“wherein said classifying comprises determining using a computer processor, with a
Multi-Naive Bayes algorithm, a probability that said executable attachment is a member
of each class in said set of classes based on said byte sequence feature and dividing
said step of determining said probability into a plurality of processing steps and
executing said processing steps in parallel.” Goodrich Decl., ¶¶ 112-114, Ex. 1003.
First, for the same reasons provided in Section VII(B)(1)(j), Elkan discloses
using multiple Naïve Bayes classifiers, more specifically a multi-Naïve Bayes algorithm,
for classifying. Forman describes a Naïve Bayes classifier that assigns a probability to
each potential class to which an instance may belong. Forman, 4:38-40, Ex. 1011.
Forman describes that an instance may be a data item or record. Forman, Abstract,
Ex. 1011. Therefore, an instance may be an executable attachment. Goodrich Decl.,
¶ 113, Ex. 1003.
A person of ordinary skill in the art would recognize that it is obvious to
modify Elkan to use Forman’s Naïve Bayes classifier for each of its Naïve Bayes classifiers.
Id. In view of Forman, each of the Naïve Bayes classifiers determines a probability for
each class in the set of classes, and the probabilities are used by the classifier to
generate an output. Id. As described by Elkan, the outputs of the multiple classifiers are
summed for classifying a file. Elkan, p. 5, sec. 4, Ex. 1012. Therefore, the
combination of Elkan’s and Forman’s teachings corresponds to determining, using a
Multi-Naive Bayes algorithm, a probability that said executable attachment is a member of
each class in a set of classes based on said byte sequence feature. Goodrich Decl., ¶
113, Ex. 1003.
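For illustration only, the arrangement described above (several Naïve Bayes classifiers, each assigning a probability to every class, with the outputs summed) can be sketched in Python. The sketch is not taken from Elkan, Forman, or the Goodrich Declaration. The function names, the two-class set, the hex-string byte sequence features, and the add-one smoothing are all assumptions chosen to keep the example small, and only features present in the file are scored.

```python
import math
from collections import Counter

# Assumed class set for the illustration.
CLASSES = ("malicious", "benign")

def train_naive_bayes(examples):
    """Train one classifier from (byte_sequence_set, class_label) pairs."""
    class_counts = Counter(label for _, label in examples)
    feature_counts = {c: Counter() for c in CLASSES}
    for features, label in examples:
        feature_counts[label].update(features)
    return class_counts, feature_counts

def class_probabilities(model, features):
    """Return a probability for each class, given one file's features."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    log_scores = {}
    for c in CLASSES:
        # Add-one smoothing (an assumption) avoids zero probabilities.
        log_p = math.log((class_counts[c] + 1) / (total + len(CLASSES)))
        for f in features:
            log_p += math.log((feature_counts[c][f] + 1) / (class_counts[c] + 2))
        log_scores[c] = log_p
    # Normalize the per-class scores so they sum to one.
    peak = max(log_scores.values())
    scaled = {c: math.exp(s - peak) for c, s in log_scores.items()}
    norm = sum(scaled.values())
    return {c: v / norm for c, v in scaled.items()}

def multi_naive_bayes(models, features):
    """Sum the per-class outputs of several classifiers, then renormalize."""
    summed = {c: 0.0 for c in CLASSES}
    for model in models:
        for c, p in class_probabilities(model, features).items():
            summed[c] += p
    norm = sum(summed.values())
    return {c: v / norm for c, v in summed.items()}
```

Each call to class_probabilities yields one probability per class, and multi_naive_bayes combines those outputs by summation, which is the structure the paragraph above attributes to the combined teachings.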
Further, the combination of Elkan and Forman discloses dividing the determination
of the probabilities into a plurality of processing steps and executing the steps
in parallel. Goodrich Decl., ¶ 114, Ex. 1003. Since each of the multiple Naïve Bayes
classifiers determines its respective probabilities, the combination of Elkan and
Forman discloses dividing the steps of determining the probabilities into a plurality of
processing steps. Id. Further, Elkan discloses that the multiple Naïve Bayes classifiers
operate on parallel computing units. Elkan, p. 1, Abstract and p. 6, sec. 6, Ex. 1012.
Since the Naïve Bayes classifiers are operating on parallel computing units, the
combination of Elkan and Forman discloses that the processing steps executed by the
classifiers are executed in parallel. Goodrich Decl., ¶ 114, Ex. 1003.
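For illustration only, dividing the probability determination into separate steps that run concurrently can be sketched as follows. This is a hypothetical example, not a description of Elkan’s system: make_classifier is a stand-in for a trained Naïve Bayes classifier, and a thread pool stands in for parallel computing units. In CPython, threads share one interpreter lock, so a process pool or separate machines would be the closer analogue to truly parallel units.

```python
from concurrent.futures import ThreadPoolExecutor

def make_classifier(malicious_weight):
    """Hypothetical stand-in for one trained Naive Bayes classifier."""
    def classify(features):
        p_malicious = min(1.0, malicious_weight * len(features))
        return {"malicious": p_malicious, "benign": 1.0 - p_malicious}
    return classify

def parallel_probabilities(classifiers, features):
    """Run each classifier as its own task, then sum and renormalize."""
    # Each classifier's probability determination is one processing step.
    with ThreadPoolExecutor(max_workers=len(classifiers)) as pool:
        outputs = list(pool.map(lambda clf: clf(features), classifiers))
    summed = {"malicious": 0.0, "benign": 0.0}
    for output in outputs:
        for label, p in output.items():
            summed[label] += p
    norm = sum(summed.values())
    return {label: value / norm for label, value in summed.items()}
```

The division into steps falls out of the structure: each classifier is an independent task, and only the final summation needs all of the results.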
Based on the above, the combination of Chen, Cardinale, Elkan, and Forman
discloses “wherein said classifying comprises determining using a computer processor,
with a Multi-Naive Bayes algorithm, a probability that said executable attachment is a
member of each class in said set of classes based on said byte sequence feature and
dividing said step of determining said probability into a plurality of processing steps
and executing said processing steps in parallel.” Goodrich Decl., ¶¶ 113-114, Ex.
1003.
f. Claim 2: “extracting static properties of said executable attachment”
For the same reasons provided in Section VII(B)(1)(g), Cardinale discloses
“extracting static properties of said executable attachment.”
g. Claim 6: “determining a probability that said executable attachment is a member of each class in a set of classes consisting of malicious, benign, and borderline”
As shown in Section VII(B)(1)(h), Cardinale discloses that the set of classes under
which an executable file may be classified consists of benign (self), malicious
(non-self), and borderline (indiscernible). Further, as shown in Section VII(B)(5)(e),
Forman discloses a Naïve Bayes classifier assigning a probability to each potential class to
which an instance may belong. Therefore, the combination of Cardinale and Forman
discloses determining a probability that said executable attachment is a member of
each class in a set of classes consisting of malicious, benign, and borderline. Goodrich
Decl., ¶ 115, Ex. 1003.
h. Claims 7 and 18: “classify[ing] said executable attachment as malicious if said probability that said executable attachment is malicious is greater than said probability that said executable attachment is benign”
Chen in view of Cardinale, and further in view of Elkan and Forman discloses
“classify[ing] said executable attachment as malicious if said probability that said
executable attachment is malicious is greater than said probability that said executable
attachment is benign.” Goodrich Decl., ¶ 116, Ex. 1003. As shown in Section
VII(B)(5)(g), Cardinale and Forman in combination disclose determining a probability
that the attachment is malicious and a probability that the attachment is benign. Id.
Further, Forman discloses that, for an instance, a preselected number of classes
having the highest probabilities are selected. Forman, 4:40-44, Ex. 1011. Thus, the
combination of Cardinale and Forman discloses that if the preselected number of classes
is one and the malicious probability is the greatest (greater than the benign
probability), the executable attachment is classified as malicious. Goodrich Decl., ¶ 116, Ex.
1003. Accordingly, the combination of Chen, Cardinale, Elkan, and Forman discloses
the elements of claims 7 and 18. Id.
i. Claims 8 and 19: “classify[ing] said executable attachment as benign if said probability that said executable attachment is benign is greater than said probability that said executable attachment is malicious”
Chen in view of Cardinale, and further in view of Elkan and Forman discloses
“classify[ing] said executable attachment as benign if said probability that said
executable attachment is benign is greater than said probability that said executable
attachment is malicious.” Goodrich Decl., ¶ 116, Ex. 1003. As shown in Section
VII(B)(5)(g), Cardinale and Forman in combination disclose determining a probability
that the attachment is malicious and a probability that the attachment is benign. Id.
Further, Forman discloses that, for an instance, a preselected number of classes
having the highest probabilities are selected. Forman, 4:40-44, Ex. 1011. Thus, the
combination of Cardinale and Forman discloses that if the preselected number of
classes is one and the benign probability is the greatest (greater than the malicious
probability), the executable attachment is classified as benign. Goodrich Decl., ¶ 116,
Ex. 1003. Accordingly, the combination of Chen, Cardinale, Elkan, and Forman
discloses the elements of claims 8 and 19. Id.
j. Claims 9 and 20: “classify[ing] said executable attachment as borderline if a difference between said probability that said executable attachment is benign and said probability that said executable attachment is malicious is within a predetermined threshold”
In view of the combination of Chen, Cardinale, Elkan, and Forman, it is
obvious to “classify said executable attachment as borderline if a difference between said
probability that said executable attachment is benign and said probability that said
executable attachment is malicious is within a predetermined threshold.” Goodrich
Decl., ¶ 117, Ex. 1003. As shown in Section VII(B)(5)(g), Cardinale and Forman in
combination disclose determining a probability that the attachment is malicious and a
probability that the attachment is benign. Goodrich Decl., ¶ 116, Ex. 1003. A person
of ordinary skill in the art would recognize that if the two probabilities are sufficiently
close to each other (the difference between the benign probability and the malicious
probability is within a threshold), it would be desirable to classify the file as a
third class (indiscernible/borderline as described by Cardinale) in order to avoid an
incorrect classification of the file. Goodrich Decl., ¶ 117, Ex. 1003. Thus, the
elements of claims 9 and 20 are obvious in view of the combination of Chen, Cardinale,
Elkan, and Forman. Id.
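For illustration only, the decision rules discussed for claims 7-9 and 18-20 can be written as a single short function. The function name, the default threshold of 0.1, and the choice to treat a difference exactly equal to the threshold as “within” it are assumptions made for this sketch and do not come from the cited references.

```python
def classify_attachment(p_malicious, p_benign, threshold=0.1):
    """Return a class label from two per-class probabilities.

    The threshold value and the inclusive comparison are assumptions.
    """
    # Borderline: the two probabilities are too close to separate reliably.
    if abs(p_benign - p_malicious) <= threshold:
        return "borderline"
    # Otherwise select the single class with the higher probability.
    return "malicious" if p_malicious > p_benign else "benign"
```

Checking the borderline condition first means a near-tie is never forced into the malicious or benign class, which reflects the stated aim of avoiding an incorrect classification.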
6. Ground 6: Chen in View of Cardinale and further in view of Elkan, Forman, and Arnold Renders Obvious Claim 5 Under 35 U.S.C. § 103(a)
Chen in view of Cardinale and further in view of Elkan, Forman, and Arnold
teaches every element of dependent claim 5. Claim 5 generally recites determining a
probability that the executable attachment is a member of each class in a set of classes
consisting of malicious and benign. These elements are plainly taught by the
combination of Chen, Cardinale, Elkan, Forman, and Arnold.
A person of ordinary skill in the art would find it obvious to combine Arnold’s
teachings with Forman’s teaching because it is nothing more than combining prior art
elements according to known methods to yield a predictable result. Goodrich Decl., ¶
121, Ex. 1003.
Additionally, a person of ordinary skill in the art would find it obvious to
combine Forman in view of Arnold with Elkan for the reasons provided in Section
VII(B)(5). Id. Further, a person of ordinary skill in the art would find it obvious to
combine Elkan in view of Forman and Arnold with Cardinale for the reasons
provided in Section VII(B)(1). Id. Further, a person of ordinary skill in the art would find it
obvious to combine Cardinale in view of Elkan, Forman, and Arnold with Chen for
the reasons provided in Section VII(B)(1). Id. One of ordinary skill in the art would
have been motivated to combine the teachings of Chen, Cardinale, Elkan, Forman,
and Arnold because all five references relate to the same field of art, classifiers. Id.
a. Claim 5: “determining a probability that said executable attachment is a member of each class in a set of classes consisting of malicious and benign”
Chen in view of Cardinale, and further in view of Elkan, Forman, and Arnold
discloses “determining a probability that said executable attachment is a member of
each class in a set of classes consisting of malicious and benign.” Goodrich Decl., ¶
120, Ex. 1003. As shown in Section VII(B)(5)(e), Forman discloses a Naïve Bayes
classifier assigning a probability to each potential class to which an instance may
belong. Further, as shown in Section VII(B)(4)(a), Arnold discloses that the set of classes
under which a file may be classified consists of malicious and benign. Therefore, the
combination of Forman and Arnold discloses determining a probability that said
executable attachment is a member of each class in a set of classes consisting of malicious
and benign. Id.
7. Ground 7: Chen in View of Cardinale and further in view of Elkan, Forman, and APA Renders Obvious Claim 3 Under 35 U.S.C. § 103(a)
For the same reasons provided in Section VII(B)(2)(a), the combination of
Chen, Cardinale, Elkan, Forman, and APA discloses the elements of claim 3.
8. Ground 8: Chen in View of Cardinale and further in view of Elkan, Forman, and Szor Renders Obvious Claim 4 Under 35 U.S.C. § 103(a)
For the same reasons provided in Section VII(B)(3)(a), the combination of
Chen, Cardinale, Elkan, Forman, and Szor discloses the elements of claim 4.
VIII. CONCLUSION
For the reasons given above, inter partes review under 35 U.S.C. § 311 and 37
C.F.R. § 42.101 of United States Patent No. 7,979,907 to Schultz et al., titled “Systems
and Methods for Detection of New Malicious Executables,” is hereby requested.
Respectfully submitted,
/David D. Schumann/
David D. Schumann
Reg. No. 53,569
December 5, 2014
CERTIFICATION OF SERVICE ON PATENT OWNER (37 C.F.R. § 42.101(a))
The undersigned hereby certifies that the foregoing Petition for Inter Partes
Review of U.S. Patent No. 7,979,907 (“the ’907 patent”), and associated Exhibits
1001-1012, were served on December 5, 2014, in their entirety by FedEx upon the
following:
The Trustees of Columbia University in the City of New York
c/o Baker Botts L.L.P.
30 Rockefeller Plaza
44th Floor
New York, NY 10112-4498
Patent owner’s correspondence address of record for USP 7,979,907
FENWICK & WEST, LLP
/Brian Hoffman/
Brian M. Hoffman
Attorney for Petitioner
Registration No. 39,713
Date: December 5, 2014
555 California Street
San Francisco, CA
Tel: (650) 988-8500