Full Record

Author
Title CROSS-ENTROPY APPROACHES TO SOFTWARE FORENSICS: SOURCE CODE AUTHORSHIP IDENTIFICATION
URL
Publication Date
Degree PhD
Discipline/Department Computer Science and Engineering
Degree Level doctoral
University/Publisher Mississippi State University
Abstract Identification of source code authorship can be a useful tool in security and forensic investigation, helping to create corroborating evidence that may send a suspected cyber terrorist, hacker, or malicious code writer to jail. In academia, it can also help professors who suspect students of academic dishonesty, plagiarism, or modification of source code in programming assignments. The purpose of this dissertation is to determine whether cross-entropy approaches to source code authorship analysis succeed in predicting the correct author of a given piece of source code. If so, this work will try to identify factors that affect the accuracy of the algorithm, how programmer experience determines accuracy, and whether a cross-entropy approach performs better than some known source code authorship approaches. The research will construct a corpus of source code written by various authors, based on both the same system descriptions and varying system descriptions, against which the different approaches can be benchmarked.
Subjects/Keywords authorship identification; source code authorship
Contributors Dr. David Dampier (chair); Dr. Rayford Vaughn (committee_member); Dr. T.J. Jankun-Kelly (committee_member); Dr. Cary Butler (committee_member)
Language en
Rights unrestricted
Country of Publication us
Format application/pdf
Record ID oai:library.msstate.edu:etd-10312011-142636
Repository msstate
Date Retrieved
Date Indexed 2018-01-03
Grantor MSSTATE

Sample Search Hits

…cross-entropy approach to author identification has been applied to literary works in multiple natural languages; however, it has not been applied to authorship attribution for source code. This work will focus on factors that affect the accuracy of the…

…cross-entropy approach, how programmer experience determines identification accuracy, and whether or not a cross-entropy approach performs better or worse than some other known source code authorship approaches. The remainder of this chapter is organized…

…1.5 Motivation and Application: Some of the earliest works published on source code authorship identification date back to the late 1980s or early 1990s [5-7]. New to the computing scene were malicious codes that threatened the evolving…

…possible to identify the author of a piece of source code from the small snippets left behind. The ultimate goal of source code authorship identification is to create techniques, or combinations of techniques, that can be applied in a legal setting to…

…for performing source code authorship identification will be discussed in Chapter II. Evaluation of the hypothesis is based on the research questions in the following paragraphs. More detail of the experimental design and additional research…

…literary works. Will cross-entropy approaches taken in literary document classification and authorship identification fare similarly when compared with cross-entropy approaches applied to source code authorship identification? First, this problem will be…

…III. APPLICATION OF CROSS-ENTROPIC APPROACHES TO SOURCE CODE CORPORA: EXPERIMENTAL DESIGN, FOCUS, AND STRUCTURE
3.1 Experimental Design and Framework…

…3.1.1 Anonymous Source Code Corpora
3.1.2 Corpora Construction, Structure, and Attributes
3.1.2.1 Student Corpora Construction Overview and Anonymization…
