Here I will send you the instructions for the project, and I will add the other stuff.
CAP 6635: Artificial Intelligence
Project Announcement, 2022 Spring
Term project proposal guideline [Final term project report due by May 6. Firm deadline]

The goal of the term project is for students to practice Bayesian networks and naïve Bayes classification for text classification. The final outcomes of the project need to be turned in as a formal technical report. Students can choose to use a term project report to substitute for the final exam, or they can choose to take the final exam and submit a simplified term project report (please refer to the technical report instructions for details).

Term Project and Final Exam

All students must participate in the term project and submit the term project report. Follow the flowchart below to determine which route (term project, term project report, final exam) you will choose.

[Flowchart: All students -> Term Project -> "Do you want to use term project to substitute final exam?"
YES: Can form a team with 3 members maximum. Must follow “Research Report Instruction”. No final exam. Report due: May 6.
NO: Can only work solo. Follow “Course Report Instruction”. Final exam on May 4. Report due: May 6.]

Students have two options: (1) use the term project report to substitute for the final exam, or (2) turn in a short version of the term project report and participate in the final exam.

Students using term project to substitute final exam: Students can choose to substitute the term project report for the final exam. If this is the case, the term project report will contribute 10+15=25 points to your final grade. The “Research Report” needs to have 4,000 words minimum, including the motivation and description of the research problems, technical solutions, validations, conclusion, references, etc. (The “Research Report Instruction” posted in Canvas includes all details.) I will assist each student (or each team) to polish the report and try to find a suitable venue to publish the report, if possible.
Students participating in final exam: For students who still want to take the final exam, the term project report will contribute 10 points to your final grade. Your report shall have 1,000 words minimum, including a statement of the research problem, designs, and validations. (The “Course Report Instruction” posted in Canvas includes all details.) The “standardized term project” is posted as an option for students participating in the final exam, but students are welcome to propose their own term project topic, even if they do NOT want to use the term project to substitute for the final exam. Students choosing to participate in the final exam CANNOT team up to work on the term project.

Naïve Bayes for Text Classification and Transfer Learning

The goal of the standardized project is for students to practice designing machine learning methods for text classification. Due to limited time, students are required to follow the naïve Bayes approach to text classification to complete the term project. Please refer to the [NB for Text Classification and transfer learning [Notebook, html]] notebook for a code example on how to design naïve Bayes classification for text classification and transfer learning. To complete the term assignment, your project must achieve the following functionality.
1. Must use the email.zip dataset (which includes normal and spam emails) as the training set, and validate the performance of the model (naïve Bayes based models or additional models) on a training vs. test split.
2. Must evaluate how stop words impact the classification results.
3. Must evaluate how an NB model trained on the email dataset performs on another SMS dataset. (The dataset has two files: sms.csv contains 5572 short messages, and labels.csv contains the label of each message: normal vs. spam.)
4. Must have a discussion on how to improve the performance of a model trained on email.zip when it is applied to the SMS dataset (a typical transfer learning scenario).
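Requirements 1 to 3 above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code from the course notebook: it implements a multinomial naïve Bayes classifier with add-one smoothing from scratch, and the toy `EMAILS` and `SMS` lists, the `STOP_WORDS` set, and all function names are hypothetical stand-ins for email.zip, sms.csv/labels.csv, and whatever stop-word list the project actually uses.

```python
import math
import random
import re
from collections import Counter

# Hypothetical toy data standing in for email.zip (source) and the SMS dataset (target).
EMAILS = [
    ("win a free prize now", "spam"),
    ("claim your free money today", "spam"),
    ("cheap loans win cash now", "spam"),
    ("free offer click now to win", "spam"),
    ("meeting agenda for the project", "normal"),
    ("please review the attached report", "normal"),
    ("lunch with the team tomorrow", "normal"),
    ("the project report is due friday", "normal"),
]
SMS = [
    ("free cash prize claim now", "spam"),
    ("win money now", "spam"),
    ("see you at lunch tomorrow", "normal"),
    ("is the report due friday", "normal"),
]
# Illustrative stop-word list only; a real project would use a fuller one.
STOP_WORDS = {"a", "the", "to", "for", "is", "at", "your", "with", "you"}


def tokenize(text, stop_words=None):
    """Lowercase, split into word tokens, and optionally drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if stop_words:
        tokens = [t for t in tokens if t not in stop_words]
    return tokens


def train_nb(samples, stop_words=None):
    """Count documents per class and word occurrences per class."""
    class_docs, word_counts, vocab = Counter(), {}, set()
    for text, label in samples:
        class_docs[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text, stop_words):
            counts[tok] += 1
            vocab.add(tok)
    return {"class_docs": class_docs, "word_counts": word_counts,
            "vocab": vocab, "stop_words": stop_words}


def predict(model, text):
    """Return the class with the highest log posterior (add-one smoothing)."""
    total_docs = sum(model["class_docs"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_docs"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)
        for tok in tokenize(text, model["stop_words"]):
            if tok in model["vocab"]:  # ignore words never seen in training
                score += math.log((counts[tok] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, samples):
    return sum(predict(model, text) == label for text, label in samples) / len(samples)


def split(samples, test_fraction=0.25, seed=0):
    """Shuffle and split into (train, test)."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


train, test = split(EMAILS)
for name, sw in [("stop words kept", None), ("stop words removed", STOP_WORDS)]:
    held_out = accuracy(train_nb(train, sw), test)   # requirement 1
    transfer = accuracy(train_nb(EMAILS, sw), SMS)   # requirement 3
    print(f"{name}: email test acc={held_out:.2f}, SMS acc={transfer:.2f}")
```

Running the loop with and without `STOP_WORDS` gives the side-by-side comparison that requirement 2 asks for. On real data the email-trained model would be expected to lose accuracy on SMS, which is the gap that requirement 4 asks you to discuss.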
For students who wish to use the term project to substitute for the final exam, the report must meet the following criteria (https://canvas.fau.edu/courses/115657/modules/items/3250127):
• You must have a design and implementation of your method that outperforms a simple NB model trained on email.zip when classifying messages in the SMS dataset.
• You can use at most 5% of the messages in the SMS dataset for training; the remaining 95% must be excluded from the training phase and used as the test set.
• You can use all messages in email.zip for training.
• You can use additional dictionaries or stop words provided by a third party. For example, short messages contain abbreviations, and you can use those abbreviations in your model.
• You must use 10 repetitions or 10-fold cross-validation to validate and report your results.

CAP 6635: Artificial Intelligence (2022 Spring)
Research Report Instruction (For Students Substituting Final Exam)
(10 points (term) + 15 points (final))
Due date [May 6 2022, Firm]

This instruction only applies to students who intend to use a longer version of the term project report (i.e., a research report) to substitute for the final exam. If you DO wish to participate in the final exam, please follow the “Course Report Instruction (For students participating in final exam)” in Canvas. The grading of the term project report is based on the following criteria.
1. Overall [3 pts]: You should organize your report in IEEE format, with 4,000 words minimum. Please note that tables and figures do not count towards the word limit. You can use the IEEE Word or LaTeX template from the following URL.
a. Template: http://www.ieee.org/conferences_events/conferences/publishing/templates.html
b. Plagiarism: You cannot copy any sentences, paragraphs, or figures from any external sources (such as published papers or the Internet).
If Turnitin indicates that a submission is over 30% similar to any other submission, the instructor will carry out a plagiarism investigation.
c. If you have to cite a figure/graph published somewhere else, please properly cite the source [0 credit if the plagiarism check returns over 50% similarity to any published work].
[Grading of grammar and typos is included in the “Overall”]
2. Title and Abstract [1 pt]: Your report should have a brief and informative title and an abstract. The abstract should have 200-300 words and summarize the problem you intend to address in the report (e.g., text classification, transfer learning using naïve Bayes classification, etc.). Briefly describe the designs and solutions proposed in the report, and briefly summarize any conclusions the report intends to draw. [200-300 words]
3. Introduction [2 pts]: Your report should have an introduction section with 500-1000 words. The introduction should clearly state (1) the research problem studied in the report; (2) the motivation for the problem; (3) how the problem is solved by existing methods, if any; and (4) a brief description of the method you will propose in the report. You should cite at least 8 relevant references (publications) in the introduction. [500-1000 words]
4. Related Work [2 pts]: Your report should have a related work section that summarizes the works (algorithms) which already exist to solve the problem. For example, if you are trying to address transfer learning for naïve Bayes classification using a small number of training samples, you need to discuss how transfer learning has been used in naïve Bayes classification, what the existing methods for short message classification are, what the common approaches are for transferring knowledge from a source domain, such as email spam, to a target domain, i.e., short message classification, etc.
[500-1000 words]
a. The related work section of your report must cite at least 10 peer-reviewed references.
5. Main body [9 pts]: In the body of your project, you will need to provide technical details of your design. [2000-2500 words]
a. If you are proposing a new approach to transfer a naïve Bayes classifier trained on one dataset to another dataset, you will need to describe your design. Describe the motivation of your design and why you think it would solve the problem. [2 pts]
b. Use flowcharts, figures, or pseudo-code to describe the details of your algorithm. [Please use at least two figures (or flowcharts) to demonstrate the system framework or architecture] [5 pts]
c. If your report is about experimental studies, you will need to provide a brief description of your learning/classification methods, the benchmark datasets, and the measures applied. You should also explain how the experiments are carried out in your study, and what empirical study goals you intend to achieve.
6. Experiments [6 pts]: In the experiments, you need to introduce (1) the main purpose of the experimental studies; (2) the tools used to design the algorithms; (3) the baseline methods used for comparison; and (4) the performance measures and data used for the empirical studies. You should also use figures and tables to report the results collected from your studies, and summarize the experimental results. [1000-1500 words]
a. Experimental settings: Include an introduction of the baseline methods, the programming tools/languages, and the parameter settings used for the different methods. [1 pt]
b. Benchmark data: Provide a detailed description of the data used for your study, including detailed information about the size/dimension of the data. [1 pt]
c. Baseline methods: In order to demonstrate the performance of your method, you will need to use a baseline approach and compare the performance of your design with the baseline.
For example, a simple naïve Bayes classifier trained on 5% of the target dataset (i.e., the SMS dataset) can be used as a baseline to demonstrate that a classifier trained on a small target set might not perform very well. In this way, the experiments will demonstrate the merits of your method and validate your hypothesis. [1 pt]
d. The results: Report the detailed results in figures/tables with the necessary analysis and descriptions. You will need to include at least one figure and one table to show the results. [2 pts]
e. Analysis of the Results: Please compare the performance of your method and the baseline approach, and analyze why your method obtains good performance. Please also add a case study example (e.g. an example of a review report and the predicted result from