Analyse RIS files

Reference managers such as EndNote, RefWorks or Zotero can usually export your bibliographic citations as a RIS file, which you can then import into tools such as Talis Aspire Reading Lists.

The script below searches the current directory and its subdirectories for RIS files (named *.ris or *.txt) and analyses their contents. For each file it counts the records of each type (the TY tag) and reports how many SN tags (ISSN or ISBN) it finds relative to the number of records. An SN is an identifier that can be used to look up better bibliographic data from another source.


while IFS= read -r -d '' file
do
	echo -n "#=== "
	printf '%q\n' "$file"
	# Count records by type (one TY line per record)
	grep -E "^TY" "$file" | sort | uniq -c
	typeCount=$(grep -E "^TY" "$file" | wc -l)
	snCount=$(grep -E "^SN" "$file" | wc -l)
	# Skip the percentage for files with no TY lines to avoid dividing by zero
	if (( typeCount > 0 ))
	then
		echo $((snCount*100/typeCount))"% of records have an SN ("$snCount" of "$typeCount")"
	fi
done < <(find . \( -name "*.ris" -o -name "*.txt" \) -print0 )

Sample output:

#=== ./PMUP00DNMod3.txt
  17 TY  - CHAP
   4 TY  - JOUR
80% of records have an SN ( 17 of  21)

#=== ./PMUP00DNMod4.txt
  11 TY  - CHAP
  10 TY  - JOUR
95% of records have an SN ( 20 of  21)
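
The percentage above compares the number of SN lines with the number of TY lines, so a record with two SN lines is counted twice and a DOI is not counted at all. The following is a minimal sketch of a per-record count, not part of the original script. The function name ris_identifier_stats is hypothetical, and it assumes each record starts with a TY line, usually ends with an ER line, and stores its DOI (if any) in the DO tag.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report how many records in one RIS file carry
# at least one SN (ISSN/ISBN) or DO (DOI) tag.
# Assumes records start with "TY  - " and normally end with "ER  -".
ris_identifier_stats() {
	awk '
		# Finish the current record, counting it if an identifier was seen
		function closeRecord() { if (inRecord && hasId) withId++; inRecord = 0 }
		/^TY  - /                  { closeRecord(); inRecord = 1; hasId = 0; total++ }
		inRecord && /^(SN|DO)  - / { hasId = 1 }
		/^ER  -/                   { closeRecord() }
		# Also close a final record that is missing its ER line
		END { closeRecord(); printf "%d of %d records have an SN or DO\n", withId + 0, total + 0 }
	' "$1"
}
```

It could be called inside the loop above as ris_identifier_stats "$file". Because each record is counted once, the result cannot exceed the number of records.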