February 2011
The current situation of online thievery seems to mimic the story of Ali Baba and the Forty Thieves, albeit with a digital twist. In the tale, a merchant named Ali Baba comes across a secret cave where the accumulated treasures of forty thieves are hidden. The cave can only be entered by uttering the magic words “open sesame”. After secretly overhearing the thieves say the password, Ali Baba waits for them to depart and attempts to enter the cave himself. He is successful and finds himself surrounded by valuable gold and silver. In an ironic turn of events, Ali Baba decides to pilfer some of the precious metals that the thieves had themselves stolen.
Like the secret cave, every website contains the collected works of its owner, but unlike the cave in the story, most owners don’t hide their website. On the contrary, they advertise it heavily by competing vigorously for the top search engine ranking. Search engine optimization (SEO) is the art and science of aligning the elements of a website so that it is indexed and listed among the top ten search engine results for a specific niche. SEO has become a major focus for website owners hoping to make a decent living online. The higher the website’s ranking, the greater its chances of attracting customers and visitors. Competition for the top is often aggressive, but most participants maintain an ethical and friendly attitude towards one another, choosing to improve their own sites rather than sabotage others. However, this is far from universal. There are plenty of individuals called “black hats” who use underhanded, and oftentimes illegal, tactics to boost their search engine ranking. One of these methods is content theft.
Upon reaching a high level of exposure, a legitimate website is likely to fall prey to thieves, because great publicity often means valuable content is displayed out in the open. Unlike the story’s treasures, which were stolen goods themselves, a website’s contents, products, and articles are the result of hard work. Online thieves who wish to profit by taking shortcuts don’t even need a secret password to access the gold and silver; all they need is a search engine or a web address. Once they find the targeted website, little can prevent them from taking what they want.
Content thieves hurt SEO in several ways. If both the owner’s and the thief’s websites rank highly, the potential traffic is split between them. The thief may even get more visitors, as happens when his site is ranked well above the original. The owner could also lose ranking, since search engine crawlers favor unique content over duplicates, and the original is more likely to be buried beneath all the other sites that copied the same material. Furthermore, having content stolen hurts credibility and reputation, because the owner has no control over the quality of the websites his content may appear on. Thieves frequently do not edit the content and may even leave the owner’s original byline in place. As such, the owner’s name could end up appearing on highly inappropriate websites.
Search engine crawlers have become more sophisticated in dealing with content thieves. Over time, the crawler will notice websites with duplicated content and will often attempt to identify the original source. If it succeeds, all other duplicates are ranked lower or are taken out of the listings altogether. However, this is not always the case, and most black hats are still capable of getting away with thievery. Although search engines can be relied upon as secondary support, the primary responsibility of catching thieves still belongs to the wronged website owner. The steps below outline some of the ways to identify and report these bandits.
Confirming the theft
The first step is to assess whether a theft has actually occurred. Some of the most popular methods for confirming a theft are as follows:
Via search engine - If the content has been lifted word-for-word, taking a sentence or phrase from the original content, enclosing it in quotation marks, and entering it as a search engine query will usually find the culprits. The quotation marks tell the search engine to look through all the websites it has indexed for the exact words within the quotation marks. If the thief did not change anything from the original article, as is often the case, websites that use it will appear in the results. The problem with this method is that it is time-consuming. If the owner has hundreds of pieces of content, it would be impractical to search for each of them individually.
Via software tools - An alternative is to use software that quickly mines the Internet for content theft and saves the user a large amount of effort compared to a manual search. For example, the online plagiarism checker Copyscape only requires the owner to input his URL, and the program then returns a list of web pages that contain similar text. It also helps identify stolen content that has been repackaged from older material. There are other programs available. Most are free, but some have more advanced features and require payment.
Via website monitoring - Monitoring a website through popular software such as Google Analytics gives the owner detailed information about his site, and this information can be used to catch thieves. For example, thieves may copy content with all of its links still attached. A wise site owner will leave links within his content that bring the visitor back to his original site. Incoming links can be seen in the monitoring software, which enables the owner to visit the websites that contain these links and find out whether they have copied any content.
There are other methods, but these three are the most commonly used and are very effective in identifying a theft.
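As a rough illustration of the first two methods, the Python sketch below builds an exact-phrase query from a sentence and estimates how much of an original article reappears in a suspect page. It is a minimal sketch under stated assumptions, not a description of how any search engine or plagiarism checker works internally: the function names, the five-word window, and the placeholder search address are all hypothetical, and the suspect page's text is assumed to have been saved by hand beforehand.

```python
import re
from urllib.parse import quote_plus


def exact_phrase_query(sentence: str) -> str:
    """Wrap a sentence in double quotes so a search engine matches it verbatim."""
    # Drop quotes already inside the sentence (they would end the phrase early)
    # and collapse runs of whitespace into single spaces.
    cleaned = " ".join(sentence.replace('"', " ").split())
    return f'"{cleaned}"'


def query_url(sentence: str,
              search_base: str = "https://www.example.com/search?q=") -> str:
    """Build a search URL. search_base is a placeholder, not a real endpoint."""
    return search_base + quote_plus(exact_phrase_query(sentence))


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word sequences of length `size` in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original: str, suspect: str, size: int = 5) -> float:
    """Fraction of the original's word sequences that also occur in the suspect."""
    source = shingles(original, size)
    if not source:
        return 0.0
    return len(source & shingles(suspect, size)) / len(source)


if __name__ == "__main__":
    article = "The cave entrance can only be accessed by uttering the magic words."
    page = "Intro text. " + article + " More text added by someone else."
    print(exact_phrase_query(article))
    print(query_url("open sesame"))
    print(copied_fraction(article, page))
```

A fraction close to 1 suggests a word-for-word copy, while a low value only means that few long word sequences match; a page that has been paraphrased would still need to be read by the owner.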
Removing the stolen content
Once a theft has been identified, there are a number of ways to have the stolen content removed. The following are three of the most common approaches:
Via private message - The easiest way is to send the thief a private message asking him to take down the content from his website. This method will work in most cases. The majority of thieves are wary of the consequences they may face should they refuse to remove the stolen content. A message from the source will often set these individuals straight and deter them from future theft. However, the owner should thoroughly document all forms of evidence before sending a private message. This will help him if he decides to file a complaint in the future or if the thief attempts to alter the content and feign innocence.
Via the webhost or ISP - Contacting the thief’s webhost or Internet service provider (ISP) and reporting the stolen content is a necessary step if the thief refuses to comply with a private request for removal. Any respectable webhost or ISP takes plagiarism seriously and will usually act readily if it finds that a theft has indeed occurred. These companies usually don’t want their reputation tarnished by catering to online thieves, so a complaint to them tends to produce a more dependable result. It is therefore very important that evidence be gathered before filing the complaint in order to support the plagiarism claim.
Via DMCA complaint - Lastly, a Digital Millennium Copyright Act (DMCA) complaint can be filed. Although filing a DMCA complaint may be a tedious task, it is one of the most effective methods available. Again, proper evidence gathering is indispensable. A lack of evidence, or evidence that the thief erased before it could be gathered, will put the owner at risk of a countersuit. Once the claim is deemed legitimate, a takedown notice is served, thereby increasing the chances of a successful removal. However, the DMCA is a United States law and may not be valid in other territories that have their own legislation.
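Since all three approaches depend on evidence gathered before the thief can alter the page, it may help to keep a simple log of what was found and when. The Python sketch below is one possible way to do that, assuming the owner has already saved the text of the copied page by hand. The function names, field names, and log file name are hypothetical, and the record only notes what the owner saw and when; it complements screenshots and saved copies rather than replacing them.

```python
import hashlib
import json
from datetime import datetime, timezone


def evidence_record(infringing_url: str, captured_text: str,
                    original_url: str) -> dict:
    """Describe one capture of a page that appears to contain copied content."""
    # The SHA-256 digest lets the owner show later that the saved copy
    # has not been changed since it was logged.
    digest = hashlib.sha256(captured_text.encode("utf-8")).hexdigest()
    return {
        "infringing_url": infringing_url,
        "original_url": original_url,
        "captured_at_utc": datetime.now(timezone.utc).isoformat(),
        "sha256_of_capture": digest,
        "length_in_characters": len(captured_text),
    }


def append_to_log(record: dict, log_path: str = "evidence_log.jsonl") -> None:
    """Append the record as one JSON line so earlier entries are never rewritten."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record, sort_keys=True) + "\n")


if __name__ == "__main__":
    record = evidence_record(
        "http://thief.example/copied-article",   # placeholder address
        "Text of the copied page as saved by the owner.",
        "http://owner.example/original-article",  # placeholder address
    )
    append_to_log(record)
    print(json.dumps(record, indent=2))
```

Keeping one entry per capture, made before any message is sent to the thief, gives the owner a dated trail to attach to a webhost, ISP, or DMCA complaint.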
With great power comes great responsibility. This much-loved saying applies to the Internet as well. Users who have achieved a certain degree of fame cannot escape the reality of having their content stolen by less ethical characters. Although it is no fault of the owner’s, he is left to bear the brunt of the work of serving justice. Thankfully, there are plenty of resources available to help him pursue his goal of upholding copyright laws. Although it is a painful burden shared by many legitimate website owners, they can take comfort in the fact that pursuing the perpetrators of plagiarism helps cement the notion that Internet content theft is intolerable. Furthermore, fighting the “open sesame” mentality of thieves who choose to enter a website and make off with valuable materials helps ensure that fair, legal, and original works are what abound in cyberspace.