
Automation and Crime – Limitations and Penalties of Web Crawling in South Korea

As big data and artificial intelligence (AI) technologies continue to advance, the importance of ‘web crawling’ – collecting and analyzing vast amounts of data from the web – is growing daily. From search engines to price comparison sites and market analysis reports, various services rely on crawling technology. However, behind the convenience lies the shadow of legal disputes. Many businesses have been particularly concerned about the extent to which crawling a competitor’s data is permitted under the law.

Amid this controversy, the South Korean Supreme Court recently issued a final ruling (Supreme Court Decision 2021Do1533, decided May 12, 2022) on a data crawling case between competing accommodation booking platforms. This judgment is significant as the Supreme Court’s first decision on criminal liability for data collection through web crawling.

1. Case Overview: Web Crawling a Competitor’s Data

This case involves executives and employees of Company A (the defendant company), which operates an accommodation information and reservation service, who were prosecuted for allegedly collecting data without authorization from the mobile app (‘Direct Booking’) or PC website of Company B (the victim company).

Key Elements of the Prosecution’s Case:

  • Information Acquisition: The defendants used packet capture programs to uncover the victim company’s app source and API server information (modules, URL addresses, command syntax, etc.).
  • Abnormal Access: Using the discovered information, they directly accessed the victim company’s API server from a PC, disguising themselves as normal users.
  • Development and Use of Crawling Programs: They developed and used a crawling program that retrieved information on all accommodation establishments within a 1,000 km radius of the defendant company. (Normally, the victim’s app provided information only within a 7-30 km radius of the user’s location.)
  • Unauthorized Data Replication: For approximately four months (June 1 – October 3, 2016), they accessed the API server once or twice daily using a crawling program and copied information such as affiliated accommodation names, addresses, and room names without authorization.

Charges Filed:

  • Violation of the Act on Promotion of Information and Communications Network Utilization and Information Protection (network intrusion)
  • Copyright Act violation (infringement of database producer rights)
  • Criminal law violation for computer obstruction of business

Conflicting Lower Court Decisions:

First Instance Court (Guilty):

  • (Network Intrusion) The court recognized intrusion based on the fact that the victim company did not disclose API information, prohibited the use of automatic connection programs in its terms of service, and the defendants continued to access by changing IPs despite IP blocking.
  • (Copyright Violation) The court determined that the database was repeatedly and systematically copied 264 times over six months.
  • (Business Obstruction) The court found that requesting nationwide information beyond the normal search range of the app generated mass calls and obstructed business operations.

Appellate Court (Not Guilty):

  • (Network Intrusion) The court found that no membership registration or password was required for API access; packet capture is an ordinary technique; no technical measures hid the API server URL or blocked access; the information taken was public; expanding the search range did not exceed the scope of access rights; and the terms of service applied only to members. IP blocking was merely a response to detected repeated access, not a prohibition of all access.
  • (Copyright Violation) The information collected (3-8 items) did not constitute a “substantial portion” compared to the entire database (about 50 items), and the content was mostly public information accessible through the app.
  • (Business Obstruction) The API server was designed to return information according to commands; requesting information within the allowed command syntax was not “inputting false information or improper commands,” and server outage dates could coincide with periods of increased normal usage.

With such stark differences between the first instance and appellate court rulings, attention focused on the Supreme Court’s final judgment.

2. What is Web Crawling and What Legal Issues Does It Raise?

To understand the Supreme Court’s decision, let’s first examine web crawling technology and related legal issues.

The Concept of Web Crawling:

Web crawling uses ‘crawler’ robots (software) to automatically explore and collect web page data from the internet. Starting from a specific web address (URL), it downloads and stores data by following links in a page sequentially.

  • Difference from Web Scraping: While scraping focuses on extracting and processing specific information from web page screens, crawling is closer to collecting web pages themselves and gathering extensive data by following links. However, in practice, these terms are often used interchangeably for data collection purposes.
  • Distinction from Hacking: Crawling generally involves accessing ‘publicly available’ servers to retrieve information, distinguishing it from hacking, which illegally infiltrates systems or alters data. (This Supreme Court ruling is significant as the first decision to confirm that crawling publicly accessible data does not, in itself, constitute network intrusion of the kind associated with ‘hacking’.)
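The link-following process described above (start from a URL, extract links, add them to a queue of pages to visit) can be sketched with Python’s standard library alone. This is a minimal, illustrative link collector, not a production crawler; the sample HTML and URLs below are hypothetical:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []  # the crawler's "frontier": pages to visit next

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Relative links are resolved against the page's base URL.
                    self.links.append(urljoin(self.base_url, value))

# Hypothetical page: one relative link, one absolute link.
sample_html = '<a href="/rooms">Rooms</a> <a href="https://other.example/deals">Deals</a>'

collector = LinkCollector("https://example.com/")
collector.feed(sample_html)
print(collector.links)
# ['https://example.com/rooms', 'https://other.example/deals']
```

A real crawler would fetch each collected URL in turn, feed the response body back into the parser, and keep a ‘visited’ set to avoid loops, but the collect-and-follow cycle shown here is the core of the technique.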

Applications of Web Crawling:

  • Search Engines: Google, Naver, and others use crawlers to collect and index global web information for search services.
  • Price Comparison: Crawling product information from various shopping malls to compare prices.
  • Data Analysis: Crawling news articles, social media posts, etc., to analyze social trends and public opinion.
  • Market Research and Competitive Analysis: Collecting publicly available price and service information from competitors to develop business strategies.

Legal Regulations and Disputes Related to Web Crawling:

Most websites allow basic crawling for search engine exposure. However, disputes arise when competitors crawl large amounts of data without authorization for commercial purposes.

  • Technical Prevention Measures (robots.txt): Website operators can restrict or allow specific crawlers through the `robots.txt` file. Compliance is voluntary, however: the file is a convention rather than a technical barrier, and not all crawlers honor it.
  • Evolution of Legal Issues: While initial concerns centered on personal information violations, as data has become a core asset, discussions from a competition law perspective (whether denying data access restricts market competition, or whether crawling itself is an unfair competitive practice) have become more active.
  • Criminal Law Interest: There has been relatively little discussion about the possibility of criminal penalties. This Supreme Court decision is notable for directly addressing the issue of criminal liability for crawling activities.
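The `robots.txt` convention mentioned above can be checked programmatically; Python ships a parser for it in the standard library (`urllib.robotparser`). The rules and the bot name ‘PriceBot’ below are hypothetical, chosen to mirror the price-comparison scenario:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: everyone may crawl except /private/,
# and a bot named "PriceBot" is banned entirely.
rules = """\
User-agent: *
Disallow: /private/

User-agent: PriceBot
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("*", "https://example.com/rooms"))         # True: public path
print(rp.can_fetch("*", "https://example.com/private/x"))     # False: disallowed path
print(rp.can_fetch("PriceBot", "https://example.com/rooms"))  # False: bot banned outright
```

Note that `can_fetch` only reports what the operator has requested; nothing in the protocol technically prevents a non-compliant crawler from fetching the disallowed URLs anyway, which is precisely why courts end up weighing other ‘objective circumstances’ such as authentication and access controls.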

3. The Supreme Court’s Decision: Is Web Crawling Subject to Criminal Liability?

The Supreme Court decision (2021Do1533) held that the appellate court was correct in finding the defendants not guilty of all three charges. The specific reasoning for each charge is as follows:

Information Network Intrusion (Violation of Information and Communications Network Act): “Access Rights Must Be Objectively Determined”

Key Issue: Did the defendants have ‘legitimate authority’ to access the victim company’s API server?

Supreme Court’s Standard: The existence of ‘access rights’ in information network intrusion should be determined not by the service provider’s subjective intent, but by carefully considering ‘objectively demonstrated circumstances’ such as ① whether technical protection measures existed to block access, and ② whether terms of service clearly specified access methods or permitted ranges. The Court presented this new legal principle, suggesting that given the open nature of the internet, access rights should not be hastily denied if clear intent to restrict is not objectively indicated.

Application to the Case:

  • The victim company’s API server URL could be easily discovered through packet capture, and there were no separate authentication procedures or technical protection measures to block access.
  • Although terms of service prohibited automatic connection programs, these appeared to apply to members, and it was difficult to interpret them as prohibiting API server access itself.
  • The victim company’s IP blocking was merely a technical measure in response to mass calls, and it was difficult to consider this an objective indication of intent to prohibit all access through IPs other than the blocked one.

Database Producer Rights Infringement (Copyright Act Violation): “Not a ‘Substantial Portion’ of the Database”

Key Issue: Did the information copied by the defendants constitute ‘all or a substantial portion’ of the victim company’s database?

Supreme Court’s Standard: Whether something is a ‘substantial portion’ should be comprehensively considered from both ‘quantitative’ aspects (compared to the entire database size) and ‘qualitative’ aspects (the importance of that portion in terms of investment or effort required to build the database).

Application to the Case:

  • The information collected by the defendants comprised only 3-8 out of approximately 50 items, making it difficult to consider it ‘quantitatively’ substantial.
  • The collected information (establishment names, addresses, prices, etc.) was mostly information publicly disclosed by the victim company for business purposes or easily accessible through normal app use, so its ‘qualitative’ importance was also deemed low.

Computer Obstruction of Business (Criminal Law Violation): “Not an ‘Improper Command’ Input”

Key Issue: Did the defendants’ input of extensive search commands beyond the normal range of the app constitute an ‘improper command’ input, and did this cause server outages that obstructed business?

Supreme Court’s Standard: An ‘improper command’ refers to commands that contradict the normal intended use and method of a system. This too should be determined based on objectively demonstrated system allowances rather than the administrator’s subjective intent.

Application to the Case:

  • The victim company’s API server was fundamentally designed to return information according to given command syntax and did not place explicit restrictions on search radius or similar parameters.
  • Therefore, the defendants’ requests for information by setting a wider search range within the allowed command syntax format could not be considered ‘improper commands’ contrary to the system’s purpose.

4. Implications and Significance of the Ruling

This Supreme Court decision has established important legal standards regarding criminal liability for web crawling.

  1. Emphasis on ‘Objective Circumstances’: When determining whether network intrusion or business obstruction crimes have been committed, the Court clarified that access rights or the impropriety of commands should be judged based on objective externally demonstrated circumstances such as technical protection measures and explicit terms of service, rather than the service provider’s subjective intent.
  2. Highlighting the Importance of Technical Protection Measures: If website or API server operators wish to restrict data access, it has become important to implement substantive technical access control measures beyond simply including provisions in terms of service.
  3. Clarification of Database ‘Substantiality’ Criteria: The Court reaffirmed that the meaning of a ‘substantial portion’ in database rights infringement under copyright law should be considered comprehensively from both quantitative and qualitative perspectives.
  4. Not a Blanket Exemption for All Crawling: This ruling does not give immunity to all types of web crawling. If there are strong technical protection measures that are bypassed, or if non-public information or substantial and critical portions of a database are taken without authorization, criminal liability may still be recognized.

5. Conclusion

Supreme Court Decision 2021Do1533 provides an important milestone on the legal permissibility of web crawling, particularly its criminal liability, at a time when crawling has become an essential technology in the data economy. It requires service providers to take clearer and more objective measures for data protection, while also setting boundaries for companies using crawling technology so that they do not unduly infringe on others’ network stability or database rights.

Legal discussions surrounding web crawling will continue to evolve with technological advancements. This decision serves as a catalyst for efforts to find a balance between free data use and innovation promotion, and the protection of information subjects’ rights and service stability.

K&P Law Firm has recently successfully represented clients in web crawling disputes between companies in South Korea, possessing specialized expertise in legal risk analysis and response strategy formulation concerning data collection and usage by IT companies. If you are struggling with legal issues related to web crawling, we are ready to provide consultation at any time.
