I use BBEdit to clean up text files generated by various highlighting and note-taking apps. One application that I particularly like, Skim, exports a list of all the highlights you’ve made in a PDF, each prefaced with a line like:

* Highlight, page 74

I wanted to get rid of those marker lines so I could incorporate the highlights into some sort of meaningful summary. Since each marker has a different page number, a literal search won't catch them all, so I use a regular expression in a grep search. The pattern that works is:

^.*Highlight, page.*$

What does each character mean?

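  • ^ (a caret) matches the start of a line.
  • $ (a dollar sign) matches the end of a line.
  • .* (a dot followed by an asterisk) matches any run of characters on the same line, including none. The first .* covers whatever sits between the beginning of the line (^) and the search term (Highlight, page), here the asterisk and space; the second covers everything from the search term to the end of the line ($), here the page number.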
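
To see the pattern at work outside BBEdit, here is a minimal Python sketch. The sample text is made up for illustration and a real Skim export may be laid out differently. The re.MULTILINE flag is an assumption needed to make ^ and $ match at each line in Python's re module, as they do in the search described above.

```python
import re

# Hypothetical sample of exported highlights; the real file may differ.
notes = """* Highlight, page 74
First highlighted passage.

* Highlight, page 75
Second highlighted passage.
"""

# re.MULTILINE makes ^ and $ match at the start and end of every line,
# not just at the start and end of the whole string.
marker = re.compile(r"^.*Highlight, page.*$", re.MULTILINE)

# Collect the marker lines the pattern finds.
markers = marker.findall(notes)

# Replace each marker line with nothing. The line break that followed
# each marker is not part of the match, so an empty line stays behind.
cleaned = marker.sub("", notes)

print(markers)
print(cleaned)
```

Each match is a whole marker line, whatever its page number. The replacement leaves an empty line where each marker used to be, because the pattern stops at the end of the line without consuming the line break.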