Thursday, July 17, 2008
Various online book searches could keep more "personal information" in play
One area of some controversy, at least among some book authors, has been the indexing of book contents by search engines. Some cooperative and self-publishers (like iUniverse) offer Partner Programs with Google, and books published after a certain date (I think that’s 2002 for iUniverse) are usually placed in these programs automatically. The end result is that some passages from the book become searchable in much the same way that other static online text essays are, ranked by some kind of proprietary algorithm for relevance to search arguments. Some self-published authors (as I have done) will, in time, make portions of their text available separately in normal HTML or PDF files for free viewing (it’s also possible to set up e-books or Kindle editions, and a few years ago there were experiments with a process called Softlock). That may duplicate the text, which search engines consider undesirable but which generally causes no problem. In these separately posted copies, the author can correct errors and misspellings or, in rare cases, remove names on request in response to the ambiguous “reputation defense” concerns of others that can arise over the long life of a book. With a copy indexed from the publisher’s files, such changes generally would not be possible.
Amazon.com also offers “search inside the book” on many books, and it seems there is a focus now on doing this with books not in other partner programs. A visitor can find any text within such a book by searching on the book’s own page, but that does not cause the text to turn up at random in general search engine results. The visitor would need to know or believe that a particular name or sequence of words is there to begin with. The capability is analogous to physically looking at a hardcopy in a “bricks and mortar” bookstore, except that the computer can find the words within the book without a physical page-by-page search, so perhaps this has some significance.
Before the “revolution” in lower-cost desktop publishing (it was starting in the 90s, well before the Internet became a repository for publishing and social networking), major book publishers tended to practice extreme due diligence before putting opuses out. That practice would certainly have changed, as we now have a world in which there are orders of magnitude more speakers than there were a couple of decades ago, and where speakers are less well informed about intellectual property law and about the unintended effects that their speech could have on others over time. That situation, partly because of the rapid increase in the effectiveness of search engines in the late 1990s, was growing even before social networking sites and Web 2.0 became all the rage. Then, perhaps to the surprise even of the younger entrepreneurs who set up many of the web companies, employers (and to some extent other agents like law enforcement) started using the Web for investigatory purposes without much care about the accuracy of the information they found on subjects. However, the media did not start to report widely on this problem until it came up in conjunction with social networking profiles and blogs, starting late in 2005.
As I have said before, it’s important that the well-established Human Resources world develop “best practices” for the use of various Internet and other investigatory tools to check applicants and existing employees, and that it inform its stakeholders openly about its policies.