# 1. Cosine similarity

Consider the following 4 documents, retrieved from a collection of 4,000,000 documents in response to the query kernel ∧ classification ∧ SVM ∧ kernel:
1. Document D1: "The SVM is a machine learning algorithm which solves classification problems. The original SVM algorithm was designed for linear classification. But the algorithm can also be used for non-linear problems using appropriate kernel."
2. Document D2: "I'm a beginner when it comes to classification with SVMs. Are there any guidelines that say which kernel to try for unknown, but nonlinear classification problem?"
3. Document D3: "Support Vector Machine (SVM) is primarily a classifier method that performs classification tasks by constructing hyperplanes. SVM supports both regression and
4. Document D4: "The goal of this paper is to provide a high-level introduction to the Kernel Trick commonly used in classification algorithms such as SVMs."
In this collection, the document frequencies of the terms "classification", "kernel", and "SVM" are 500,000, 300,000, and 10,000, respectively. Compute the ranking score of each document with respect to the query using the atn.atn weighting scheme and the cosine similarity measure (see Section 6.4.3). Then order the documents by score. Use the format of Table 6.1, Section 6.4.4, to show intermediate computations (required for credit).
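
The computation can be sketched in Python as a way to check the hand-filled table. This is a minimal sketch under several assumptions that the problem statement does not fix: "a" is read as augmented term frequency, 0.5 + 0.5 · tf / max tf, with the maximum taken over the three query terms only; "t" is idf with a base-10 logarithm; a term that does not occur gets weight 0; matching is case-insensitive. The helper names (`atn_weights`, `cosine`) and the term counts for D1 (read off the text by exact match) are illustrative, not part of the assignment.

```python
import math

N = 4_000_000  # collection size, from the problem statement
DF = {"classification": 500_000, "kernel": 300_000, "svm": 10_000}


def atn_weights(tf):
    """Return atn weights: augmented tf (a) times idf (t), no normalization (n)."""
    max_tf = max(tf.values(), default=0)
    weights = {}
    for term, df in DF.items():
        count = tf.get(term, 0)
        # Assumption: absent terms get weight 0 instead of the 0.5 floor.
        aug = 0.5 + 0.5 * count / max_tf if count > 0 else 0.0
        # Assumption: idf uses log base 10.
        weights[term] = aug * math.log10(N / df)
    return weights


def cosine(u, v):
    """Cosine similarity of two weight vectors over the query terms."""
    dot = sum(u[t] * v[t] for t in DF)
    norm_u = math.sqrt(sum(u[t] ** 2 for t in DF))
    norm_v = math.sqrt(sum(v[t] ** 2 for t in DF))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Query: "kernel" appears twice, the other two terms once.
query_tf = {"kernel": 2, "classification": 1, "svm": 1}
# D1 counts by exact match: SVM x2, classification x2, kernel x1.
d1_tf = {"svm": 2, "classification": 2, "kernel": 1}

q_w = atn_weights(query_tf)
d1_w = atn_weights(d1_tf)
score_d1 = cosine(q_w, d1_w)
print(q_w, d1_w, score_d1)
```

The same two calls can be repeated for D2 through D4 once their term counts are fixed. Whether "SVMs" counts as "SVM", or "classifier" as "classification", is a tokenization choice that changes the counts and should be stated in the table.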
