CS 598 Computational Scientometrics (Fall 2022)

Instructors: George Chacko & Tandy Warnow

Computational Scientometrics, GGC43665S104, 0930 – 1045, M W, Online

Course Update(s):

Given the weather, office hours for both instructors will be virtual from now on.

Objectives: This graduate course is centered around applying quantitative analytical techniques to problems in scientometrics that concern research metadata, particularly citations. The course consists of presentations, critical discussions, and research projects. Participating students will explore scientific questions, analyze data, and develop new methods or apply existing ones. Lectures and classes will be online. Office hours will be a combination of in-person and virtual. This course has one required in-person activity: a presentation of your course project proposal. You will need to schedule this presentation with the instructors. It will not take place during the class lecture but can occur during in-person office hours.

Overview: The term scientometrics, while accommodating a plurality of perspectives, generally refers to quantitative science studies. One definition (and there are others) can be found here. A history of the field can be found in the essays and articles in the Garfield Library at UPenn. As examples, five research articles in scientometrics are listed below. All five will be critically discussed during this course.

Emphasis will be placed on interdisciplinary perspectives, the use of open source computing tools, and publicly available data. The course will feature guest speakers who bring unique perspectives and experiences with them. Critical discussion of research literature will be coupled to designing and executing a required research project.

Students will be evaluated on their level of engagement in the class, satisfactory completion of homework assignments, and the quality of their presentations and of their draft and final project reports. Note that there is a required in-person presentation of the course project proposal. You will need to schedule this with the instructors, and these presentations will not take place during the class lectures.

Students are encouraged to publish results from these projects. Examples of publications that resulted from projects with the envisioned level of effort can be found here. For students interested in expanding their course project into a publication, the instructors will help them develop and improve their work and, finally, submit and publish their research findings in journals or conferences.

  • Class Presentation: 20%
  • Course Homework Assignments: 30%
  • Class Participation: 10%
  • Course Project: 40%

Who should take the course: The course is designed for graduate students in Computer Science, ECE, or Statistics. However, it is open, with permission of the instructors, to graduate students from other programs as well as advanced undergraduates. Interested students are urged to contact the instructors before registering.

Minimum Required Skills: The ability to retrieve data from relational databases and APIs and to analyze these data in tabular and graph formats. Intermediate programming ability and familiarity with statistical analysis and Linux environments.

Invited Lectures

  • Sep 07: Henry Small Can information science be applied to the history and philosophy of science?
  • Sep 12: Daniel Gusfield Two Theoretical Topics in Clustering.
  • Sep 14: Vincent Traag Structure of disagreement in science.
  • Sep 19: Daniel Hook and Simon Porter Democratizing Analysis: Changing the relationship between scientometric data and its users.
  • Sep 21: David Sepkoski Scientometrics and the History of Science. (To be rescheduled.)
  • Sep 28: Kevin Boyack Global Models of Science – Relatedness, Clustering and Characterization.
  • Oct 03: Martina Iori The complexity of new knowledge: knowledge recombination in scientific articles.
  • Oct 05: Srijan Sengupta Core periphery structure in networks: a statistical perspective.

Course Project:

  • Enrolled students are encouraged to work in teams of two. If you wish to work alone, please consult with the instructors.
  • Initial Proposal: Due by midnight US Central on Sep 15. Email the proposal to chackoge@illinois.edu. The initial proposal should be a PDF and run no longer than one page of text. The proposal should state and explain: (i) the question being asked, (ii) how answering the question would contribute to the body of existing knowledge, (iii) the approach to be used, and (iv) how results will be interpreted.
  • Final Proposal: Due by midnight US Central on Oct 15. Email the proposal to the instructors. The final proposal should reflect iterative discussion with the instructors between Sep 15 and Oct 15. It is expected to be on the order of 5-10 pages of text but may exceed this length.
  • Final Report: Due by midnight Dec 7, 2022. For details on the grace period and the penalty for late submission, click on this link. The report should conform to the format expected of a research article published in a respectable scientific journal. 20-30 pages is a ballpark estimate for document size.

Office Hours

  • Chacko: Tuesdays 11-11:30 am (virtual), Fridays 9-9:30 am (virtual)
  • Warnow: Fridays 1-2 pm (virtual)
