Selenium WebDriver

Broken Link Detection

Learn to identify and validate broken links in web applications using HTTP response codes and URL connection testing.

Understanding Broken Links

What are Broken Links?

Broken links (also called invalid links) are hyperlinks that do not work: they point to pages that no longer exist, return server errors, or fail to load because of problems such as DNS failures or timeouts.

Impact on Users

  • Poor user experience
  • Loss of credibility
  • Reduced engagement
  • Navigation frustration

SEO Impact

  • Lower search rankings
  • Reduced crawl efficiency
  • Negative site quality signals
  • Lost link equity

HTTP Response Code Classification

Response Code Categories

HTTP response codes are grouped into series that indicate different types of responses from the server.

✅ Valid Links (Response < 400)

1** Series - Informational
Interim response: the server has received the request and is still processing it
Example: 100 Continue, 101 Switching Protocols
2** Series - Success
Request was successful
Example: 200 OK, 201 Created, 204 No Content
3** Series - Redirection
The server redirects the client from the requested URL to another one
Example: 301 Moved Permanently, 302 Found

❌ Invalid Links (Response ≥ 400)

4** Series - Client Errors
Client-side error codes
Example: 404 Not Found, 403 Forbidden, 400 Bad Request
5** Series - Server Errors
Server-side error codes
Example: 500 Internal Server Error, 502 Bad Gateway

📝 Quick Rule

• Response code < 400 = Valid link
• Response code ≥ 400 = Invalid/Broken link
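
The quick rule can be sketched as a small helper. The class name LinkStatus and its methods are hypothetical, chosen only for illustration. One assumption goes beyond the rule as stated: HttpURLConnection.getResponseCode() returns -1 when the response is not valid HTTP, so this sketch treats anything below 100 as broken too.

```java
/** Illustrative helper (hypothetical name) applying the "< 400 is valid" rule. */
final class LinkStatus {

    /** Maps a status code to its series name, e.g. 404 -> "Client Error". */
    static String series(int code) {
        switch (code / 100) {
            case 1: return "Informational";
            case 2: return "Success";
            case 3: return "Redirection";
            case 4: return "Client Error";
            case 5: return "Server Error";
            default: return "Unknown";
        }
    }

    /** Valid means a real HTTP status (100 or higher) that is below 400. */
    static boolean isValid(int code) {
        return code >= 100 && code < 400;
    }

    /** Everything else, including -1 (no valid HTTP response), counts as broken. */
    static boolean isBroken(int code) {
        return !isValid(code);
    }

    public static void main(String[] args) {
        int[] samples = {200, 301, 404, 500};
        for (int code : samples) {
            System.out.println(code + " " + series(code) + " -> "
                    + (isBroken(code) ? "broken" : "valid"));
        }
    }
}
```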

Single Link Validation

6-Step Process for Link Validation

Learn the systematic approach to validate individual links using Java's URL and HttpURLConnection classes.

Step 1: Create URL Object
URL url = new URL("https://www.facebook.com");
Step 2: Open Connection
URLConnection urlCon = url.openConnection();
Step 3: Type Cast to HttpURLConnection
HttpURLConnection httpCon = (HttpURLConnection)urlCon;
Step 4: Connect to URL
httpCon.connect();
Step 5: Get Response Code
int responseCode = httpCon.getResponseCode();
Step 6: Validate Link
if (responseCode < 400) {
    // Valid link
} else {
    // Invalid (broken) link
}

Complete Single Link Example

package Tutorial18;

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

public class Demo1 {
    public static void main(String[] args) throws IOException {

        // Step 1: Create URL object
        URL url = new URL("http://www.deadlinkcity.com/error-page.asp?e=404");

        // Step 2: Open connection
        URLConnection urlCon = url.openConnection();

        // Step 3: Type cast to HttpURLConnection
        HttpURLConnection httpCon = (HttpURLConnection) urlCon;

        // Step 4: Connect
        httpCon.connect();

        // Step 5: Get response code
        int responseCode = httpCon.getResponseCode();

        // Step 6: Validate link
        if (responseCode < 400) {
            System.out.println("Link is valid");
        } else {
            System.out.println("Link is invalid");
        }
    }
}

Knowledge Check

What HTTP response code series indicates valid links?

Key Points Summary

🎯 Link Validation Essentials

  • Response codes < 400 = Valid links
  • Response codes ≥ 400 = Broken links
  • Validate only HTTP/HTTPS links (skip schemes such as mailto: and javascript:)
  • Handle null and empty href attributes
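
The last two points above could be combined into a filter that runs before any connection is opened. This is a minimal sketch: HrefFilter and its methods are hypothetical names, and the href strings are assumed to have been read from the page's anchor elements already.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Illustrative filter (hypothetical name): keeps only hrefs worth an HTTP check. */
final class HrefFilter {

    /** True when href is non-null, non-blank, and uses the http or https scheme. */
    static boolean isCheckable(String href) {
        if (href == null) {
            return false;
        }
        String normalized = href.trim().toLowerCase(Locale.ROOT);
        return normalized.startsWith("http://") || normalized.startsWith("https://");
    }

    /** Returns the trimmed hrefs that pass isCheckable, in their original order. */
    static List<String> checkable(List<String> hrefs) {
        List<String> result = new ArrayList<>();
        for (String href : hrefs) {
            if (isCheckable(href)) {
                result.add(href.trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> hrefs = Arrays.asList(
                "https://example.com/a", null, "",
                "mailto:me@example.com", "javascript:void(0)",
                "http://example.com/b");
        // Prints [https://example.com/a, http://example.com/b]
        System.out.println(checkable(hrefs));
    }
}
```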

⚡ Implementation Tips

  • Use try-catch for robust error handling
  • Set appropriate connection timeouts
  • Filter links before validation
  • Implement proper logging mechanisms
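
The first two tips might look like the sketch below, which wraps the six-step check in try-catch and sets connect and read timeouts. SafeLinkCheck, isValid, and the timeout value are hypothetical; writing failures to System.err is a stand-in for whatever logging the project actually uses.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

/** Illustrative wrapper (hypothetical name) around the six-step link check. */
final class SafeLinkCheck {

    /** Returns true only when the link answers with an HTTP status below 400. */
    static boolean isValid(String link, int timeoutMillis) {
        HttpURLConnection httpCon = null;
        try {
            URLConnection urlCon = new URL(link).openConnection();
            if (!(urlCon instanceof HttpURLConnection)) {
                return false; // not an HTTP/HTTPS link, so the cast would fail
            }
            httpCon = (HttpURLConnection) urlCon;
            httpCon.setConnectTimeout(timeoutMillis); // limit time to establish the connection
            httpCon.setReadTimeout(timeoutMillis);    // limit time waiting for the response
            int code = httpCon.getResponseCode();     // connects implicitly if needed
            return code >= 100 && code < 400;
        } catch (IOException e) {
            // Malformed URLs, unknown hosts, and timeouts all end up here.
            System.err.println("Check failed for " + link + ": " + e);
            return false;
        } finally {
            if (httpCon != null) {
                httpCon.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        String link = args.length > 0 ? args[0] : "https://example.com";
        System.out.println(link + " valid? " + isValid(link, 5000));
    }
}
```

Treating every IOException as a broken link is a simplification; a stricter checker might report timeouts separately, since a slow server is not necessarily a dead link.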

📈 Scalability Considerations

  • Use thread pools for parallel processing
  • Implement rate limiting for large sites
  • Cache results to avoid duplicate checks
  • Consider using headless browsers
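
As a rough illustration of the thread pool and caching points, the sketch below checks unique links in parallel and remembers earlier results. ParallelLinkChecker is a hypothetical name, and the actual check is passed in as a Predicate so the example runs without network access. The fixed pool size caps concurrent requests, which is only a crude form of rate limiting.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

/** Illustrative parallel checker (hypothetical name) with a result cache. */
final class ParallelLinkChecker {
    private final Map<String, Boolean> cache = new ConcurrentHashMap<>();
    private final Predicate<String> checker; // returns true when a link is valid
    private final int threads;

    ParallelLinkChecker(Predicate<String> checker, int threads) {
        this.checker = checker;
        this.threads = threads;
    }

    /** Checks each distinct, non-null link once; cached links are not re-checked. */
    Map<String, Boolean> checkAll(Collection<String> links) {
        Set<String> unique = new LinkedHashSet<>(links); // drop duplicates, keep order
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<Boolean>> pending = new LinkedHashMap<>();
            for (String link : unique) {
                if (!cache.containsKey(link)) {
                    Callable<Boolean> task = () -> checker.test(link);
                    pending.put(link, pool.submit(task));
                }
            }
            for (Map.Entry<String, Future<Boolean>> entry : pending.entrySet()) {
                cache.put(entry.getKey(), entry.getValue().get()); // wait for each result
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while checking links", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Link check failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        Map<String, Boolean> results = new LinkedHashMap<>();
        for (String link : unique) {
            results.put(link, cache.get(link));
        }
        return results;
    }

    public static void main(String[] args) {
        // Stand-in checker for the demo; a real one would open an HTTP connection.
        ParallelLinkChecker checker =
                new ParallelLinkChecker(link -> !link.contains("broken"), 4);
        System.out.println(checker.checkAll(Arrays.asList(
                "https://example.com/a",
                "https://example.com/a",
                "https://example.com/broken")));
    }
}
```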

🚨 Common Pitfalls

  • Not handling relative URLs properly
  • Ignoring JavaScript-generated links
  • Missing timeout configurations
  • Not considering authentication requirements
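
For the relative URL pitfall, one possible approach is to resolve each href against the URL of the page it was found on before checking it. The sketch below uses java.net.URI.resolve; LinkResolver and toAbsolute are hypothetical names, and the example.com addresses are placeholders.

```java
import java.net.URI;

/** Illustrative resolver (hypothetical name) for relative hrefs. */
final class LinkResolver {

    /**
     * Resolves href against the page URL. Absolute hrefs come back unchanged.
     * URI.create throws IllegalArgumentException for syntactically invalid input
     * (for example, an href containing spaces), so callers may want to catch it.
     */
    static String toAbsolute(String pageUrl, String href) {
        return URI.create(pageUrl).resolve(href.trim()).toString();
    }

    public static void main(String[] args) {
        String page = "https://example.com/docs/index.html";
        System.out.println(toAbsolute(page, "page2.html"));      // https://example.com/docs/page2.html
        System.out.println(toAbsolute(page, "/contact"));        // https://example.com/contact
        System.out.println(toAbsolute(page, "../about.html"));   // https://example.com/about.html
    }
}
```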
