
doi: 10.13016/m2sk5f-nyzm
Patents play an important role in helping inventors and organizations protect their intellectual property. With the rapid increase in the number of patents granted over the last 25 years, it has become important to create tools and methodologies that make this large corpus easier to understand. This thesis aims to classify patents by assignee, the assignee being the company that owns the patent, using a text classification approach. Six companies/organizations are chosen as assignees: Amazon Technologies, Apple Inc., Google LLC, International Business Machines Corporation (IBM), Intel Corporation, and Microsoft Corporation. Two machine learning models are trained for classification: a Naive Bayes model and a Neural Network model. Two experiments are performed: the first uses only the abstract of each patent, and the second uses both the abstract and the claims. Python scripts are used to download the patent documents, extract the data items of interest, pre-process the dataset, and train and test the machine learning models. The results are analysed and the performance of the classifiers is compared. The best-performing model was the Neural Network implementation using Keras, with an accuracy of 79.02%.
Machine Learning, NLP, Patent Classification, Patents
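As an illustration of the text classification approach named in the abstract, the following is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing, written in plain Python. It is not the thesis code: the class name, the whitespace tokenizer, the placeholder labels `assignee_a` and `assignee_b`, and the toy training strings are all assumptions made up for this example, and the thesis may well have used a library implementation and different pre-processing.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real pre-processing)."""
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Tokens never seen during training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / self.total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data: invented word lists, not real patent abstracts.
train_texts = [
    "processor cache memory instruction pipeline",
    "search query ranking web index",
    "memory controller processor voltage",
    "web page search advertisement ranking",
]
train_labels = ["assignee_a", "assignee_b", "assignee_a", "assignee_b"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
prediction = model.predict("processor memory bandwidth")
print(prediction)
```

In this sketch each class is scored by its log prior plus the smoothed log likelihood of every known token, and the highest-scoring class is returned. Extending the input from abstract-only text to abstract plus claims, as in the second experiment, would only change the strings passed to `fit` and `predict`.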
