Unix Global Find and Replace (using find, grep, and sed)

Finding and replacing strings and characters can be a dicey operation for a web developer. Too much can lead to breaking your web site. Too little can lead to missing that piece of HTML needing to be updated or deleted. Many tools exist that can help you in the process – Dreamweaver, Homesite, and many text editors have powerful interfaces for searching and replacing. (E.g., use Ctrl-F on Windows or Cmd-F on Mac to see the Find and Replace window in Dreamweaver.) But there’s some really great functionality on the command line for Unix/Linux users that shouldn’t be overlooked. I’ve been experimenting with a procedure for making these global matches and replaces within the Unix shell environment and I wanted to document the process somewhere. This seems like as good a place as any…

Important: All the commands below must be run from the shell environment on a Unix or Linux system. If you aren’t sure what I’m talking about, check the Wikipedia entry for shell.

Step 1: Find the pattern needing to be replaced or updated, print out files needing change

find . -exec grep 'ENTER STRING OR TEXT TO SEARCH FOR' '{}' \; -print

*Note: I’m using the “find” and “grep” commands to search for a matching pattern, which will print out a list of files and directories that need changes. If I’m at the top level of my web site, the “.” in the find command will search for the pattern down through any directories below. On a large site, the process can take some time.
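To make Step 1 concrete, here’s a small sketch in a throwaway directory. The hostname old.example.com and the file names are purely illustrative stand-ins for whatever string you’re hunting for:

```shell
# Build a tiny fake site tree to search (hypothetical names throughout)
mkdir -p /tmp/grep-demo/sub
echo '<a href="http://old.example.com/">link</a>' > /tmp/grep-demo/index.html
echo 'no match here' > /tmp/grep-demo/sub/about.html
cd /tmp/grep-demo || exit 1

# Step 1: print each matching line, then the file it came from.
# grep will grumble about directories on stderr; add "-type f" to
# the find command if you want to silence that.
find . -exec grep 'old.example.com' '{}' \; -print
```

Only index.html shows up in the output, since about.html never contained the string.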

Step 2: Move/copy files into test directory to test expression; preserve owners, groups, timestamps

cp -p -r test test-backup

*Note: These directories would be named according to the directories or files you need to change based on the results from Step 1. The “-p” will preserve owners, groups, and timestamps in the copied directory. The “-r” will copy recursively down through any associated sub-directories. I do this so I can compare the new directory to the original directory after I’ve test run the global changes.
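Here’s a hedged sketch of Step 2, where “test” stands in for whatever directory turned up in Step 1. The diff at the end is the payoff: after a trial run of the replacement, diff -r shows exactly which files and lines changed between the copy and the original:

```shell
# Hypothetical layout: "test" is the directory due for changes
mkdir -p /tmp/cpdemo/test
echo '<p>hello</p>' > /tmp/cpdemo/test/page.html
cd /tmp/cpdemo || exit 1

# -p preserves mode, ownership, and timestamps; -r copies sub-directories too
cp -p -r test test-backup

# Before any edits, the two trees should be identical
diff -r test test-backup && echo "directories match"
```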

Step 3: Run test on find and replace expression in /test-backup/ directory

find . \( -name "*.php" -or -name "*.html" \) | xargs grep -l 'ENTER STRING OR TEXT TO SEARCH FOR' | xargs sed -i -e 's/ENTER OLD STRING OR TEXT TO REPLACE/ENTER REPLACEMENT STRING OR TEXT/g'

*Note: I’m using the “find .” command to search for .php and .html files in the current working directory, as I only want to target the files that need to be touched (change the file patterns according to your requirements). Next, I pipe that result to “grep -l”, which prints only the names of the files that actually contain the search string. Finally, I pass that list to the “sed” command, which matches the old string or text in each file and replaces it with the new value in place.
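And a worked example of the full Step 3 pipeline, again with made-up strings and file names. One caveat worth flagging: the bare “sed -i” shown here is GNU sed syntax (standard on Linux); BSD/macOS sed wants “sed -i ''” instead. Note also that sed treats “.” as a wildcard, so dots in the old string should be escaped:

```shell
# Hypothetical tree: one file contains the old string, one doesn't
mkdir -p /tmp/sed-demo/sub
printf '<p>Contact: webmaster@old.example.com</p>\n' > /tmp/sed-demo/index.php
printf '<p>No address here</p>\n' > /tmp/sed-demo/sub/page.html
cd /tmp/sed-demo || exit 1

# find narrows to .php/.html files, grep -l keeps only files that match,
# and sed -i rewrites the old string in place (GNU sed syntax)
find . \( -name "*.php" -or -name "*.html" \) \
  | xargs grep -l 'old\.example\.com' \
  | xargs sed -i -e 's/old\.example\.com/new.example.com/g'

# Confirm the replacement landed
grep -r 'new.example.com' .
```

Only index.php gets rewritten; page.html never matched, so sed never touches it.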

Step 4: Test files and applications to ensure changes didn’t break functionality and that owners, groups, timestamps were preserved

Step 5: After testing, run expression from Step 3 in ALL the directories or files needing the changes. Delete /test-backup/ directory

Steps 1 and 3 are the heart of the matter. I’m learning the power of these commands, so I’m pretty cautious about backing up and testing on directories and files that aren’t live. Once I have the expression dialed in, I’ll run it on a more global scale. So, there you have it – my find and replace process in a nutshell. Use at your own discretion and feel free to share your thoughts in the comments.


Computers in Libraries 2008 – Trends and Highlights

It’s been just over a week since I returned from Computers in Libraries. The InfoToday crew usually puts together a nice set of speakers and ideas and this year was more of the same. I’m not even going to try and summarize everything – check LibrarianInBlack for some of the best summaries or the Technorati CIL2008 tag to follow along from home. As I mentioned in a previous post, I arrived as the conference was winding down, but I was still able to pick up some tips, teach a couple of workshops, and spot some library trends. I’m going to focus on the trends part as the week has given me some time to collect my thoughts. So, here they are, the library trends I’m seeing (based on CIL’s programming and my interests).

Twitter and Libraries – Microblogging and its associated platforms are starting to be noticed and utilized in some library settings. Right now, the emphasis is on connecting to friends, but as more info gets shared in 140-character bits, the Twitter channel is becoming a resource. It’s all about following keywords and topics and choosing your Twitter network of followers (aka “friends”). Pownce and Tumblr are two similar services to watch.

Web Services for Everybody – When Yahoo Pipes came on the scene about a year ago, I wondered when it might start showing up as a tool for library mashups. It’s actually a pretty simple way to use web services in a graphical user interface. Pipes seems limited to pretty simple formats (RSS and ATOM) generally, but it introduces the power and concepts behind web services intuitively. In the long term, it’s still best to learn web services, XML, and some scripting for truly robust mashups, but Pipes lowers the bar for entry in a really nice way.

The Portable Library – At MSU libraries, we’ve been looking at ways to bring library resources into a user’s networked environment. See our widgets and tools for some examples. It was great to meet other developers and libraries pursuing similar efforts. I got to have an extended discussion with Binky Lush, lead web developer at Penn State University Libraries, about her efforts to place library web services into users’ working environments. It’s refreshing to see some of these attempts to move away from the gatekeeping model of web sites as single points of service. Leveraging the network and learning to broadcast bits and pieces of the library into multiple web spaces – iGoogle, iTunes, Course Management Systems – will be an important move for libraries going forward.

Open Source and Learning Outside the Profession – Open source solutions for libraries are becoming easier to implement, but it was nice to see the balanced conversation and practical examples of open source possibilities for libraries that were part of the “Open Source” track moderated by Nicole Engard. I wasn’t able to see the “Beyond Libraries: Industries Using Hot Tech” track, but the idea of looking outside our comfort zone and learning from other industries really resonates with me as an essential trend to follow. Steven Cohen (the track organizer) is onto something here. Innovation frequently happens elsewhere; let’s hear more about it.


Ajax and Web Services Workshops

I came a bit late to the Computers in Libraries 2008 party by arriving on Tuesday night, but I was still able to catch up with a few people and make some new friends. It was interesting presenting to a group as they were eating lunch (never done that one before), but the presentations went well. I also had a great time teaching the workshops yesterday. For those that are interested, all the files and code samples from my talks are available below.

Workshop: “Web Services for Libraries.”
Computers in Libraries, Crystal City, VA, 10 April 2008.
http://www.lib.montana.edu/~jason/talks/cil2008-workshop-webservices.pdf
http://www.lib.montana.edu/~jason/talks/cil2008-workshop-webservices-handout.pdf

Workshop: “Ajax (Asynchronous Javascript and XML) for Libraries.”
Computers in Libraries, Crystal City, VA, 10 April 2008.
http://www.lib.montana.edu/~jason/talks/cil2008-workshop-ajax.pdf
http://www.lib.montana.edu/~jason/talks/cil2008-workshop-ajax-handout.pdf

Cybertour: “Next-Generation Digital Libraries.”
Computers in Libraries, Crystal City, VA, 09 April 2008.
http://www.lib.montana.edu/~jason/talks/cil2008-session-diglib.pdf

Cybertour: “What To Do When Interface Design Goes Bad.”
Computers in Libraries, Crystal City, VA, 09 April 2008.
http://www.lib.montana.edu/~jason/talks/cil2008-session-interface.pdf