{"id":192,"date":"2025-11-24T00:57:20","date_gmt":"2025-11-24T00:57:20","guid":{"rendered":"http:\/\/ijeesoo.com\/?page_id=192"},"modified":"2025-11-24T02:27:48","modified_gmt":"2025-11-24T02:27:48","slug":"link-analysis","status":"publish","type":"page","link":"http:\/\/ijeesoo.com\/?page_id=192","title":{"rendered":"Link Analysis"},"content":{"rendered":"\n<div class=\"wp-block-group alignfull has-global-padding is-layout-constrained wp-container-core-group-is-layout-490fa1db wp-block-group-is-layout-constrained\" style=\"padding-top:var(--wp--preset--spacing--50);padding-right:var(--wp--preset--spacing--50);padding-bottom:var(--wp--preset--spacing--50);padding-left:var(--wp--preset--spacing--50)\">\n<div class=\"wp-block-group has-global-padding is-layout-constrained wp-container-core-group-is-layout-650790e1 wp-block-group-is-layout-constrained\">\n<h1 class=\"wp-block-heading has-text-align-center has-x-large-font-size\">Why Links Matter<\/h1>\n\n\n\n<div style=\"height:1.25rem\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Core Principle:<\/strong> Links are not just navigation paths; they are <strong>implicit endorsements<\/strong> and <strong>votes of confidence<\/strong>.<\/li>\n\n\n\n<li><strong>Structural Signals:<\/strong> They provide a rich structural layer that complements content (keyword) analysis.<\/li>\n\n\n\n<li><strong>Authority &amp; Quality:<\/strong> Link analysis allows search engines to quantify the <em>importance<\/em> and <em>quality<\/em> of a page.\n<ul class=\"wp-block-list\">\n<li>A link from an authoritative source transfers trust and relevance.<\/li>\n\n\n\n<li>Crucial for solving the <strong>Quality\/Authority<\/strong> problem in search ranking.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<div style=\"height:1.25rem\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p><\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-group has-global-padding is-layout-constrained 
wp-container-core-group-is-layout-650790e1 wp-block-group-is-layout-constrained\">\n<h1 class=\"wp-block-heading has-text-align-center has-x-large-font-size\">Historical Context: A Breakthrough<\/h1>\n\n\n\n<div style=\"height:1.25rem\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Pre-Link Analysis (The Past):<\/strong> Ranking was based almost entirely on term frequency, density, and content matching. This was highly vulnerable to keyword stuffing and manipulation.<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>The Breakthrough (Late 1990s):<\/strong> The introduction of link analysis (specifically PageRank and HITS) revolutionized ranking.<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>The Change:<\/strong> PageRank in particular introduced an <strong>intrinsic, query-independent measure of importance<\/strong> (HITS scores, by contrast, are computed on a query-specific subset of the web). Link-based scores were much harder to inflate artificially than on-page term counts, leading to significantly better search quality.<\/li>\n<\/ul>\n\n\n\n<div style=\"height:1.25rem\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p><\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-group has-global-padding is-layout-constrained wp-container-core-group-is-layout-650790e1 wp-block-group-is-layout-constrained\">\n<h1 class=\"wp-block-heading has-text-align-center has-x-large-font-size\">The Web as a Graph<\/h1>\n\n\n\n<div style=\"height:1.25rem\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Definition:<\/strong> The entire World Wide Web can be modeled as a massive, directed graph.<\/li>\n\n\n\n<li><strong>Nodes (Vertices):<\/strong> Represent individual web pages, documents, or resources (e.g., a specific URL).<\/li>\n\n\n\n<li><strong>Edges (Directed Links):<\/strong> Represent the hyperlinks connecting the pages.\n<ul class=\"wp-block-list\">\n<li>The connection is <strong>directed<\/strong>: A link from A to B is not the same as a link from B to 
A.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Structure:<\/strong> The web graph is characterized by its enormous <strong>scale<\/strong>, high <strong>sparsity<\/strong> (each page links to only a tiny fraction of all pages), and complex <strong>connectivity<\/strong>.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-group alignfull has-base-2-background-color has-background has-global-padding is-layout-constrained wp-container-core-group-is-layout-669513ed wp-block-group-is-layout-constrained\" style=\"margin-top:0;margin-bottom:0;padding-top:var(--wp--preset--spacing--50);padding-right:var(--wp--preset--spacing--50);padding-bottom:var(--wp--preset--spacing--50);padding-left:var(--wp--preset--spacing--50)\">\n<div class=\"wp-block-group is-vertical is-content-justification-center is-layout-flex wp-container-core-group-is-layout-52b864f0 wp-block-group-is-layout-flex\">\n<h2 class=\"wp-block-heading has-text-align-center is-style-asterisk\">Introduction to Link Analysis<\/h2>\n\n\n\n<div style=\"height:0px\" aria-hidden=\"true\" class=\"wp-block-spacer wp-container-content-4f2ccadb\"><\/div>\n<\/div>\n\n\n\n<div style=\"margin-top:0;margin-bottom:0;height:var(--wp--preset--spacing--40)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d0bbbce0 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-container-core-column-is-layout-e3d1c41b wp-block-column-is-layout-flow\">\n<h3 class=\"wp-block-heading has-text-align-left is-style-asterisk has-body-font-family has-medium-font-size\" style=\"font-style:normal;font-weight:600\">The Web as a Graph<\/h3>\n\n\n\n<p class=\"has-text-align-left\">7<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-container-core-column-is-layout-e3d1c41b wp-block-column-is-layout-flow\">\n<h3 class=\"wp-block-heading has-text-align-left is-style-asterisk 
has-body-font-family has-medium-font-size\" style=\"font-style:normal;font-weight:600\">PageRank<\/h3>\n\n\n\n<p class=\"has-text-align-left\">18<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-container-core-column-is-layout-e3d1c41b wp-block-column-is-layout-flow\">\n<h3 class=\"wp-block-heading has-text-align-left is-style-asterisk has-body-font-family has-medium-font-size\" style=\"font-style:normal;font-weight:600\">Hubs and Authorities (HITS Algorithm)<\/h3>\n\n\n\n<p class=\"has-text-align-left\">6<\/p>\n<\/div>\n<\/div>\n\n\n\n<div style=\"height:var(--wp--preset--spacing--20)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d0bbbce0 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-container-core-column-is-layout-e3d1c41b wp-block-column-is-layout-flow\">\n<h3 class=\"wp-block-heading has-text-align-left is-style-asterisk has-body-font-family has-medium-font-size\" style=\"font-style:normal;font-weight:600\">Hubs and Authorities (HITS Algorithm)\u2014Choosing the Subset of the Web<\/h3>\n\n\n\n<p class=\"has-text-align-left\">3<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-container-core-column-is-layout-e3d1c41b wp-block-column-is-layout-flow\">\n<h3 class=\"wp-block-heading has-text-align-left is-style-asterisk has-body-font-family has-medium-font-size\" style=\"font-style:normal;font-weight:600\">Advanced Link-Based Models<\/h3>\n\n\n\n<p class=\"has-text-align-left\">6<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-container-core-column-is-layout-e3d1c41b wp-block-column-is-layout-flow\">\n<h3 class=\"wp-block-heading has-text-align-left is-style-asterisk has-body-font-family has-medium-font-size\" style=\"font-style:normal;font-weight:600\">Link Analysis Beyond Web Search<\/h3>\n\n\n\n<p class=\"has-text-align-left\">3<\/p>\n<\/div>\n<\/div>\n\n\n\n<div 
class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d0bbbce0 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-container-core-column-is-layout-e3d1c41b wp-block-column-is-layout-flow\">\n<h3 class=\"wp-block-heading has-text-align-left is-style-asterisk has-body-font-family has-medium-font-size\" style=\"font-style:normal;font-weight:600\">Modern Trends in Link Analysis<\/h3>\n\n\n\n<p class=\"has-text-align-left\">3<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-container-core-column-is-layout-e3d1c41b wp-block-column-is-layout-flow\">\n<h3 class=\"wp-block-heading has-text-align-left is-style-asterisk has-body-font-family has-medium-font-size\" style=\"font-style:normal;font-weight:600\">Evaluation and Critiques<\/h3>\n\n\n\n<p class=\"has-text-align-left\">3<\/p>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-group alignfull has-global-padding is-layout-constrained wp-container-core-group-is-layout-d89aad35 wp-block-group-is-layout-constrained\" style=\"margin-top:0;margin-bottom:0;padding-top:var(--wp--preset--spacing--50);padding-right:var(--wp--preset--spacing--50);padding-bottom:var(--wp--preset--spacing--50);padding-left:var(--wp--preset--spacing--50)\">\n<div class=\"wp-block-group alignwide has-global-padding is-layout-constrained wp-container-core-group-is-layout-19e250f3 wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-group is-vertical is-content-justification-center is-layout-flex wp-container-core-group-is-layout-6329a8f3 wp-block-group-is-layout-flex\">\n<h2 class=\"wp-block-heading has-text-align-center is-style-asterisk\">The Web as a Graph<\/h2>\n<\/div>\n\n\n\n<div style=\"height:var(--wp--preset--spacing--40)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column 
is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Structural Properties: Scale and Sparsity<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>Scale (Massive Size):<\/strong> The graph contains billions of nodes (pages) and trillions of edges (links). Processing this scale requires distributed computing (e.g., MapReduce).<\/li>\n\n\n\n<li><strong>Sparsity:<\/strong> The transition matrix representing the graph is extremely sparse.\n<ul class=\"wp-block-list\">\n<li>Sparsity means the number of actual links is tiny compared to the total possible number of links between all pages (on the order of <em>N<\/em><sup>2<\/sup> for <em>N<\/em> pages).<\/li>\n\n\n\n<li>This property is key for efficient storage and computation.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Structural Properties: Connectivity<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>The Bow-Tie Structure (Connectivity):<\/strong> Research shows the Web is not uniformly connected but resembles a &#8220;bow tie.&#8221;\n<ul class=\"wp-block-list\">\n<li><strong>SCC (Strongly Connected Component):<\/strong> A large core where every page is reachable from every other page.<\/li>\n\n\n\n<li><strong>IN Component:<\/strong> Pages that can reach <strong>into<\/strong> the SCC by following links but cannot be reached from it.<\/li>\n\n\n\n<li><strong>OUT Component:<\/strong> Pages that can be reached <strong>out of<\/strong> the SCC but have no path back into it.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Power-Law Distribution:<\/strong> The distribution of incoming links (in-degree) approximately follows a power law.\n<ul class=\"wp-block-list\">\n<li>A very few pages (such as Wikipedia and major news sites) have a disproportionately large number of incoming links.<\/li>\n\n\n\n<li>Most pages 
have very few incoming links.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/div>\n<\/div>\n\n\n\n<div style=\"height:var(--wp--preset--spacing--40)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Directed vs. Undirected Graphs<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>Directed Graph (The Reality):<\/strong> The Web is inherently directed.\n<ul class=\"wp-block-list\">\n<li>A link from Page A to Page B ($A \\to B$) does not imply a reciprocal link ($B \\to A$).<\/li>\n\n\n\n<li>Link analysis algorithms like PageRank rely entirely on this directionality to model the flow of authority.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Undirected View (For Modeling):<\/strong> Occasionally, the graph can be treated as undirected to analyze relationships like co-citation or similarity, but not for authority ranking, which depends on link direction.<\/li>\n<\/ul>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Components and Problems<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>Strongly Connected Components (SCCs):<\/strong> Sets of pages where you can start at any page and reach any other page within the set by following links. 
Pages within an SCC share and circulate authority effectively.<\/li>\n\n\n\n<li><strong>Dangling Nodes (Sink Nodes):<\/strong> These are pages that have incoming links but <strong>no outgoing links<\/strong>.\n<ul class=\"wp-block-list\">\n<li><strong>Problem:<\/strong> In the random surfer model (used by PageRank), the surfer gets &#8220;stuck&#8221; here, leading to non-convergence or rank leakage.<\/li>\n\n\n\n<li><strong>Solution:<\/strong> Algorithms handle these nodes by assuming a jump (teleportation) to a random page.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Anchor Text: Definition and Role<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>Definition:<\/strong> Anchor text is the clickable, visible text within a hyperlink tag.\n<ul class=\"wp-block-list\">\n<li>Example: &lt;a href=&quot;www.target.com&quot;&gt;<strong>This is the Anchor Text<\/strong>&lt;\/a&gt;<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Role:<\/strong> Anchor text acts as a <strong>label or description<\/strong> provided by the <em>linking page<\/em> to describe the content of the <em>target page<\/em>.<\/li>\n\n\n\n<li><strong>Why it&#8217;s powerful:<\/strong> It is often a concise, human-generated summary of the target page&#8217;s content, which can be more accurate than the target page&#8217;s own title or metadata.<\/li>\n<\/ul>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Anchor Text as a Relevance Signal<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list 
is-style-checkmark-list\">\n<li><strong>Aggregation:<\/strong> Search engines aggregate all anchor text pointing to a single page.\n<ul class=\"wp-block-list\">\n<li>This aggregated text is treated as a highly trusted set of keywords associated with the target page.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Relevance:<\/strong> It is a powerful signal for query relevance. If hundreds of external sites use the phrase &#8220;best travel guides&#8221; in their anchor text when linking to a specific blog, that blog is very likely relevant for that query.<\/li>\n\n\n\n<li><strong>Combining Signals:<\/strong> Anchor text is one of the most effective ways to combine the content of one page (the anchor) with the authority of the target page.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Retrieval and Ranking Examples<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>Example 1: Wikipedia:<\/strong> Wikipedia pages are highly authoritative (high PageRank) and receive links with descriptive anchor text (e.g., &#8220;Battle of Hastings&#8221;). 
This combination makes them rank highly for relevant, specific queries.<\/li>\n\n\n\n<li><strong>Example 2: Corporate Pages:<\/strong> A company&#8217;s homepage might not contain the phrase &#8220;Investor Relations,&#8221; but if many financial news sites link to it using that phrase as anchor text, the page can still rank for that query.<\/li>\n\n\n\n<li><strong>Ranking Mechanism:<\/strong> The anchor text terms are treated as if they were part of the target page&#8217;s document index, allowing content-based search engines to leverage link structure indirectly.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-group alignwide has-global-padding is-layout-constrained wp-container-core-group-is-layout-19e250f3 wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-group is-vertical is-content-justification-center is-layout-flex wp-container-core-group-is-layout-6329a8f3 wp-block-group-is-layout-flex\">\n<h2 class=\"wp-block-heading has-text-align-center is-style-asterisk\">PageRank<\/h2>\n<\/div>\n\n\n\n<div style=\"height:var(--wp--preset--spacing--40)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">PageRank: The Core Intuition<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>Definition:<\/strong> PageRank (PR) is an algorithm that measures the <strong>intrinsic importance<\/strong> or <strong>authority<\/strong> of a web page based on its link structure.<\/li>\n\n\n\n<li><strong>Intuition: &#8220;Votes&#8221; are Weighted:<\/strong> A page is important if it is linked to by many pages, and especially if it is linked to by many <strong>important<\/strong> 
pages.\n<ul class=\"wp-block-list\">\n<li>A link from CNN is worth far more than a link from a new personal blog.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Recursive Definition:<\/strong> The rank of a page depends on the ranks of the pages linking to it. This recursive dependency forms the basis of the calculation.<\/li>\n\n\n\n<li><strong>Core Principle:<\/strong> A page gains a high rank if it is linked to by many pages that <em>already<\/em> have high rank. Links act as endorsements, and the value of that endorsement is proportional to the source page&#8217;s own rank.<\/li>\n<\/ul>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Simple PageRank<\/h3>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"521\" src=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-9-1024x521.png\" alt=\"\" class=\"wp-image-212\" srcset=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-9-1024x521.png 1024w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-9-300x153.png 300w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-9-768x391.png 768w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-9.png 1130w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n<\/div>\n\n\n\n<div style=\"height:var(--wp--preset--spacing--40)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">PageRank (Suffers &#8220;Rank Sink&#8221; Problem)<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" 
class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>The Flaw: Rank Sink Problem:<\/strong> The Simple PageRank algorithm breaks down on pages that link only to themselves (rank sinks) or that have no outgoing links at all (dangling nodes).\n<ul class=\"wp-block-list\">\n<li><strong>Mechanism:<\/strong> If a page only points to itself, the rank it accumulates is <strong>trapped<\/strong> at that node; if it has no out-links, the rank it receives is passed on to no one. Either way, that rank cannot be redistributed to the rest of the web.<\/li>\n\n\n\n<li><strong>Consequence:<\/strong> With each iteration, rank drains away from the rest of the graph: a sink absorbs an ever larger share, while a dangling node leaks rank so that the total is no longer conserved. In both cases the ranks of other pages are unfairly reduced.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Teleportation<\/h3>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"441\" src=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-10-1024x441.png\" alt=\"\" class=\"wp-image-213\" srcset=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-10-1024x441.png 1024w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-10-300x129.png 300w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-10-768x331.png 768w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-10.png 1514w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">The Random Surfer Model<\/h3>\n\n\n\n<figure 
class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"374\" src=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-11-1024x374.png\" alt=\"\" class=\"wp-image-215\" srcset=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-11-1024x374.png 1024w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-11-300x110.png 300w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-11-768x281.png 768w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-11.png 1338w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Importance of Link-Based Ranking<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>Query-Independent Authority:<\/strong> PageRank calculates a score for every page <strong>before<\/strong> a user enters a query. 
This score measures the page&#8217;s general, global authority.<\/li>\n\n\n\n<li><strong>Separation of Concerns:<\/strong> It cleanly separates the concepts of:\n<ul class=\"wp-block-list\">\n<li><strong>Authority (Quality):<\/strong> Measured by PageRank (link structure).<\/li>\n\n\n\n<li><strong>Relevance (Topic):<\/strong> Measured by keyword matching (content).<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Breakthrough:<\/strong> This pre-computed authority score provides a robust mechanism to combat content spam, because links from other sites are much harder for a spammer to control than the spammer&#8217;s own page content.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Markov Chains: The Basics<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>Formal Model:<\/strong> The random surfer model is mathematically equivalent to a <strong>Markov Chain<\/strong>.<\/li>\n\n\n\n<li><strong>States:<\/strong> The states in the chain are the individual web pages.<\/li>\n\n\n\n<li><strong>Memoryless Property:<\/strong> The probability of moving to the next page depends <em>only<\/em> on the current page, not on the sequence of pages visited previously.<\/li>\n\n\n\n<li><strong>Transition Matrix (<\/strong><strong><em>M<\/em><\/strong><strong>):<\/strong> This matrix represents all links in the web graph.\n<ul class=\"wp-block-list\">\n<li>An entry <em>M<\/em><sub><em>ij<\/em><\/sub> is the probability of moving from page <em>j<\/em> to page <em>i<\/em>.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Transition 
Probabilities<\/h3>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"871\" height=\"484\" src=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-12.png\" alt=\"\" class=\"wp-image-216\" srcset=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-12.png 871w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-12-300x167.png 300w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-12-768x427.png 768w\" sizes=\"auto, (max-width: 871px) 100vw, 871px\" \/><\/figure>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Steady-State and Connection to PageRank<\/h3>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"878\" height=\"323\" src=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-13.png\" alt=\"\" class=\"wp-image-217\" srcset=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-13.png 878w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-13-300x110.png 300w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-13-768x283.png 768w\" sizes=\"auto, (max-width: 878px) 100vw, 878px\" \/><\/figure>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Iterative Computation (Power Iteration)<\/h3>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"579\" 
src=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-14-1024x579.png\" alt=\"\" class=\"wp-image-218\" srcset=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-14-1024x579.png 1024w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-14-300x170.png 300w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-14-768x434.png 768w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-14.png 1136w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Handling Dangling Nodes<\/h3>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"489\" src=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-15-1024x489.png\" alt=\"\" class=\"wp-image-220\" srcset=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-15-1024x489.png 1024w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-15-300x143.png 300w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-15-768x367.png 768w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-15.png 1492w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">The Damping Factor (<em>d<\/em>)<\/h3>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"692\" src=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-16-1024x692.png\" alt=\"\" class=\"wp-image-221\" 
srcset=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-16-1024x692.png 1024w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-16-300x203.png 300w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-16-768x519.png 768w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-16.png 1191w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Convergence and Stability<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>Convergence Speed:<\/strong> The PageRank algorithm typically converges relatively quickly, usually within 50 to 100 iterations, to an acceptable tolerance level, even for the massive web graph.<\/li>\n\n\n\n<li><strong>Stability:<\/strong> The damping factor <em>d<\/em> ensures that the iteration converges to a unique, numerically stable solution. The PageRank scores change slowly over time because the underlying link structure of the web is relatively stable.<\/li>\n\n\n\n<li><strong>Update Frequency:<\/strong> Because the computation is massive, PageRank is not calculated in real time. 
It&#8217;s recomputed periodically (e.g., weekly or monthly) on the latest snapshot of the web graph, and these pre-computed scores are used in the live ranking system.<\/li>\n<\/ul>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Scalability and Large-Scale Computation<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>The Bottleneck:<\/strong> The primary challenge is not the complexity of the algorithm but the sheer <strong>size<\/strong> of the transition matrix (billions of pages and even more links).<\/li>\n\n\n\n<li><strong>Distributed Computing:<\/strong> PageRank calculation is performed using <strong>distributed processing frameworks<\/strong> like MapReduce (or similar modern big data platforms).<\/li>\n\n\n\n<li><strong>MapReduce Implementation:<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Map Phase:<\/strong> A mapper processes each page, takes its current PageRank score, and emits one contribution to every page it links to, weighted by the transition probability.<\/li>\n\n\n\n<li><strong>Reduce Phase:<\/strong> The reducer gathers all incoming rank contributions for a specific page <em>i<\/em> and sums them up to calculate the new <strong>PR<\/strong>(<em>i<\/em>) for the next iteration.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>This approach effectively parallelizes the large matrix-vector multiplication, making the computation feasible across thousands of servers.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Motivation: Why Context Matters<\/h3>\n\n\n\n<ul 
style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>The Limitation of Global PageRank:<\/strong> Standard PageRank assigns a single &#8220;authority score&#8221; to every page. It assumes that if a page is important, it is universally important.<\/li>\n\n\n\n<li><strong>The Context Problem:<\/strong> Authority is inherently <strong>topic-dependent<\/strong>.\n<ul class=\"wp-block-list\">\n<li><em>Example:<\/em> A highly authoritative page about &#8220;Jaguar cars&#8221; might receive a high global PageRank, but it is irrelevant to a user researching &#8220;Jaguar&#8221; the animal.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>The Goal:<\/strong> To move from &#8220;Global Importance&#8221; to <strong>&#8220;Contextual Importance,&#8221;<\/strong> ensuring that ranking reflects authority <em>within a specific domain<\/em> of interest rather than just general popularity across the entire web.<\/li>\n<\/ul>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Topic-Sensitive PageRank (TSPR)<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>The Mechanism: Biased Teleportation:<\/strong>\n<ul class=\"wp-block-list\">\n<li>In standard PageRank, the random surfer teleports to <em>any<\/em> page with equal probability (<em>1\/N<\/em>) when they get bored.<\/li>\n\n\n\n<li>In TSPR, we restrict the teleportation set to pages on a specific <strong>Topic<\/strong> (e.g., &#8220;Sports&#8221; or &#8220;Cooking&#8221;).<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>The Seed Set (<\/strong><strong><em>S<\/em><\/strong><strong>):<\/strong> A set of pages pre-classified as highly relevant to a given topic (e.g., based on directory listings).<\/li>\n\n\n\n<li><strong>The Calculation:<\/strong> When the random surfer &#8220;teleports,&#8221; they jump exclusively to one of 
these seed pages in set <em>S<\/em>. This biases the entire rank distribution towards pages that are topologically close to the topic seeds.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Personalized PageRank (PPR)<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>User-Specific Bias:<\/strong> PPR extends the TSPR idea from a topic to an <strong>individual user&#8217;s interests<\/strong>.<\/li>\n\n\n\n<li><strong>The Personalized Seed Set (<\/strong><strong><em>P<\/em><\/strong><strong>):<\/strong> The teleport set <em>P<\/em> is determined by the user&#8217;s history, profile, bookmarks, or recent searches.<\/li>\n\n\n\n<li><strong>The Mechanism:<\/strong> The random surfer is biased to jump back to pages the <em>specific user<\/em> has previously deemed important or relevant.<\/li>\n\n\n\n<li><strong>Result:<\/strong> A unique PageRank vector is calculated for each user or group, providing rankings tailored to their personal preferences and interests.<\/li>\n<\/ul>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Applications in Search and Recommendation<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>Focused Search:<\/strong> TSPR is used to pre-calculate multiple PageRank vectors (e.g., TSPR-Science, TSPR-Arts). 
When a user submits a query, the system classifies the <em>intent<\/em> and uses the corresponding TSPR vector to weight results, boosting relevant domain authorities.<\/li>\n\n\n\n<li><strong>Recommendation Systems:<\/strong> PPR can be used to recommend new content. Pages that have high PPR scores relative to a user&#8217;s interest profile but low global PageRank are excellent candidates for discovery.<\/li>\n\n\n\n<li><strong>Example:<\/strong> TSPR helps a search for &#8220;Mars&#8221; by prioritizing pages authoritative in &#8220;Astronomy&#8221; over those authoritative in &#8220;Candy Bars,&#8221; ensuring topical relevance is reinforced by link structure.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-group alignwide has-global-padding is-layout-constrained wp-container-core-group-is-layout-19e250f3 wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-group is-vertical is-content-justification-center is-layout-flex wp-container-core-group-is-layout-6329a8f3 wp-block-group-is-layout-flex\">\n<h2 class=\"wp-block-heading has-text-align-center is-style-asterisk\">Hubs and Authorities (HITS Algorithm)<\/h2>\n<\/div>\n\n\n\n<div style=\"height:var(--wp--preset--spacing--40)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Concept of Hubs and Authorities<\/h3>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"348\" src=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-17-1024x348.png\" alt=\"\" class=\"wp-image-223\" srcset=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-17-1024x348.png 1024w, 
http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-17-300x102.png 300w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-17-768x261.png 768w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-17.png 1478w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">The Intuition: Mutually Recursive Relationship<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>The Core Philosophy:<\/strong> The definitions of Hubs and Authorities are circular and mutually reinforcing:\n<ul class=\"wp-block-list\">\n<li>A <strong>Good Hub<\/strong> is a page that points to many <strong>Good Authorities<\/strong>.<\/li>\n\n\n\n<li>A <strong>Good Authority<\/strong> is a page that is pointed to by many <strong>Good Hubs<\/strong>.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Recursive Quality:<\/strong> You cannot calculate one without the other. 
The algorithm exploits this relationship to identify high-quality communities of pages relevant to a specific query.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n\n\n\n<div style=\"height:var(--wp--preset--spacing--40)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">HITS Algorithm Step 1: Constructing the Graph<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>Root Set:<\/strong> The algorithm starts by retrieving a small set of relevant pages (e.g., the top 200 results) from a standard text-based search engine for the user&#8217;s query.<\/li>\n\n\n\n<li><strong>Expansion to Base Set:<\/strong> To capture the link structure, the Root Set is expanded to form the <strong>Base Set<\/strong>:\n<ul class=\"wp-block-list\">\n<li>Add any page that <em>links to<\/em> a page in the Root Set.<\/li>\n\n\n\n<li>Add any page that is <em>linked to by<\/em> a page in the Root Set.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Goal:<\/strong> This creates a focused subgraph of the web that contains the most relevant communities for that specific query.<\/li>\n<\/ul>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">HITS Algorithm Step 2: Iterative Updates<\/h3>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"629\" src=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-18-1024x629.png\" alt=\"\" class=\"wp-image-224\" 
srcset=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-18-1024x629.png 1024w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-18-300x184.png 300w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-18-768x472.png 768w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-18-1536x943.png 1536w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-18.png 1545w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">HITS Normalization Step (Part of Step 2): Controlling Score Growth<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>The Problem:<\/strong> The iterative update process (where Hub scores sum Authority scores, and Authority scores sum Hub scores) causes the scores to increase rapidly with every step. If left unchecked, all scores would grow without bound and lose their comparative meaning.<\/li>\n\n\n\n<li><strong>The Solution:<\/strong> After every Authority Update step and every Hub Update step, the scores must be <strong>normalized<\/strong>. 
This ensures the resulting scores are scaled back into a comparable range, usually between 0 and 1.<\/li>\n<\/ul>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">HITS Normalization Step: Controlling Score Growth<\/h3>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"601\" src=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-19-1024x601.png\" alt=\"\" class=\"wp-image-225\" srcset=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-19-1024x601.png 1024w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-19-300x176.png 300w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-19-768x451.png 768w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-19.png 1330w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-group alignwide has-global-padding is-layout-constrained wp-container-core-group-is-layout-19e250f3 wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-group is-vertical is-content-justification-center is-layout-flex wp-container-core-group-is-layout-6329a8f3 wp-block-group-is-layout-flex\">\n<h2 class=\"wp-block-heading has-text-align-center is-style-asterisk\">Hubs and Authorities (HITS Algorithm): Choosing the Subset of the Web<\/h2>\n<\/div>\n\n\n\n<div style=\"height:var(--wp--preset--spacing--40)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Query-Dependent Subgraph 
Selection<\/h3>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"449\" src=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-20-1024x449.png\" alt=\"\" class=\"wp-image-227\" srcset=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-20-1024x449.png 1024w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-20-300x132.png 300w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-20-768x337.png 768w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-20.png 1493w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Practical Considerations and Limitations<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>Computational Cost:<\/strong> Because the graph is query-specific, the HITS algorithm must be run in <strong>real time<\/strong> after the user submits a query. 
This makes it significantly slower and more computationally expensive than pre-computed algorithms like PageRank.<\/li>\n\n\n\n<li><strong>Topic Drift:<\/strong> A major limitation is &#8220;Topic Drift.&#8221; If the Root Set contains a few pages from a tightly knit but irrelevant community (e.g., a general query about &#8220;Jaguars&#8221; drifting entirely into &#8220;Jaguar Cars&#8221; due to strong linking among car sites), the algorithm can converge on the wrong topic, ignoring the user&#8217;s broader intent.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n\n\n\n<div style=\"height:var(--wp--preset--spacing--40)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Vulnerabilities: Link Spam and SEO<\/h3>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"387\" src=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-21-1024x387.png\" alt=\"\" class=\"wp-image-228\" srcset=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-21-1024x387.png 1024w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-21-300x113.png 300w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-21-768x290.png 768w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-21.png 1477w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-group alignwide has-global-padding is-layout-constrained wp-container-core-group-is-layout-19e250f3 wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-group 
is-vertical is-content-justification-center is-layout-flex wp-container-core-group-is-layout-6329a8f3 wp-block-group-is-layout-flex\">\n<h2 class=\"wp-block-heading has-text-align-center is-style-asterisk\">Advanced Link-Based Models<\/h2>\n<\/div>\n\n\n\n<div style=\"height:var(--wp--preset--spacing--40)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">TrustRank: Combatting Web Spam<\/h3>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"335\" src=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-23-1024x335.png\" alt=\"\" class=\"wp-image-231\" srcset=\"http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-23-1024x335.png 1024w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-23-300x98.png 300w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-23-768x251.png 768w, http:\/\/ijeesoo.com\/wp-content\/uploads\/2025\/11\/image-23.png 1478w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">TrustRank: Combatting Web Spam<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>The Challenge:<\/strong> &#8220;Link Farms&#8221; and spam pages artificially inflate PageRank, making standard link analysis unreliable for quality control.<\/li>\n\n\n\n<li><strong>The Semi-Supervised 
Approach:<\/strong> TrustRank assumes that good pages rarely link to bad pages.\n<ul class=\"wp-block-list\">\n<li><strong>Oracle Set:<\/strong> Start with a small, manually verified set of trustworthy &#8220;seed&#8221; pages (e.g., universities, government sites).<\/li>\n\n\n\n<li><strong>Trust Propagation:<\/strong> Propagate trust scores outward from these seeds using a biased PageRank calculation.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Trust Attenuation:<\/strong> Trust dampens as it moves further from the seed set. A page 1 click away from a seed is highly trusted; a page 5 clicks away is less so.<\/li>\n\n\n\n<li><strong>Spam Mass:<\/strong> By comparing a page&#8217;s standard PageRank with its TrustRank, we can estimate its &#8220;Spam Mass,&#8221; the portion of its rank likely derived from manipulation.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n\n\n\n<div style=\"height:var(--wp--preset--spacing--40)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">TrustRank: The Algorithm<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>The Algorithm:<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Seed Selection:<\/strong> Identify a small, high-quality set of trusted &#8220;Seed Pages&#8221; (e.g., .gov, .edu sites) that are manually verified as non-spam.<\/li>\n\n\n\n<li><strong>Biased PageRank:<\/strong> Run a modified PageRank calculation where the random surfer teleports <em>only<\/em> to these trusted seed pages (similar to Topic-Sensitive PageRank).<\/li>\n\n\n\n<li><strong>Trust Attenuation:<\/strong> Trust scores decrease as you move further away from the seed set. 
Pages with high TrustRank scores are likely legitimate; those with low scores are potential spam.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Application:<\/strong> Used primarily to filter out spam from search results and demote pages involved in manipulative linking schemes.<\/li>\n<\/ul>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">SALSA (Stochastic Approach for Link-Structure Analysis)<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>The Problem with HITS:<\/strong> The original HITS algorithm is vulnerable to the &#8220;Tightly Knit Community&#8221; (TKC) effect, where a small group of mutually linking pages (even if irrelevant) can dominate the results.<\/li>\n\n\n\n<li><strong>The SALSA Solution:<\/strong> SALSA modifies HITS by casting the problem as a <strong>Random Walk<\/strong> on a bipartite graph of hubs and authorities.<\/li>\n\n\n\n<li><strong>Mechanism:<\/strong> Instead of summing all scores (which causes explosion), SALSA normalizes the link weights based on the number of links.\n<ul class=\"wp-block-list\">\n<li>When a hub passes weight to the authorities it points to, that weight is divided by the hub&#8217;s number of out-links; when an authority passes weight back to its hubs, it is divided by the authority&#8217;s number of in-links.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Result:<\/strong> This stochastic approach is significantly more stable and less prone to manipulation or topic drift than the original HITS algorithm.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Integration with Content-Based Signals<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list 
is-style-checkmark-list\">\n<li><strong>Authority vs. Relevance:<\/strong> Link analysis measures <em>static<\/em> global authority, but it doesn&#8217;t inherently measure <em>relevance<\/em> to a specific user query.<\/li>\n\n\n\n<li><strong>The Hybrid Signal:<\/strong> Modern search engines do not rank by PageRank alone.\n<ul class=\"wp-block-list\">\n<li><strong>Relevance Score:<\/strong> Computed using content models (TF-IDF, BM25, BERT) that match the document against the query.<\/li>\n\n\n\n<li><strong>Quality Score:<\/strong> Computed using link models (PageRank, TrustRank).<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Ranking Fusion:<\/strong> The final rank is a function (often determined by Machine Learning \/ Learning-to-Rank models) that combines these signals. A page needs <strong>both<\/strong> high relevance and high authority to rank well.<\/li>\n<\/ul>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Anchor Text Weighting and Hybrid Approaches<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>Weighted Links:<\/strong> Not all links are created equal. 
Advanced models assign different weights to edges based on the <strong>Anchor Text<\/strong>.\n<ul class=\"wp-block-list\">\n<li>A link with anchor text &#8220;click here&#8221; carries less topical weight than a link with anchor text &#8220;neuroscience research paper.&#8221;<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Structural Weighting:<\/strong> Links are also weighted by their position:\n<ul class=\"wp-block-list\">\n<li><strong>Editorial Links:<\/strong> Links inside the main body text are trusted more (higher weight).<\/li>\n\n\n\n<li><strong>Navigational Links:<\/strong> Links in footers or sidebars are trusted less (lower weight).<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Hybrid Graph Models:<\/strong> Newer approaches construct heterogeneous graphs that include not just pages, but users, queries, and clicks as nodes, allowing link analysis to propagate through user behavior data as well as web structure.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-group alignwide has-global-padding is-layout-constrained wp-container-core-group-is-layout-19e250f3 wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-group is-vertical is-content-justification-center is-layout-flex wp-container-core-group-is-layout-6329a8f3 wp-block-group-is-layout-flex\">\n<h2 class=\"wp-block-heading has-text-align-center is-style-asterisk\">Link Analysis Beyond Web Search<\/h2>\n<\/div>\n\n\n\n<div style=\"height:var(--wp--preset--spacing--40)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Citation Analysis in Academic Publishing<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list 
is-style-checkmark-list\">\n<li><strong>The Origin of Link Analysis:<\/strong> Web link analysis was heavily inspired by bibliometrics, the statistical analysis of academic citations.<\/li>\n\n\n\n<li><strong>Direct Parallels:<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Paper = Node:<\/strong> Each academic paper is a node in the citation graph.<\/li>\n\n\n\n<li><strong>Citation = Link:<\/strong> A citation from Paper A to Paper B is a directed edge, signaling &#8220;credit&#8221; or &#8220;prior art.&#8221;<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Metrics:<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Impact Factor:<\/strong> Similar to in-degree centrality, measuring the average number of citations received by articles in a journal.<\/li>\n\n\n\n<li><strong>h-index:<\/strong> A metric for author-level authority, balancing productivity (number of papers) and citation impact (number of links).<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Bibliographic Coupling &amp; Co-citation:<\/strong> These concepts (papers citing the same work, or being cited together) are the conceptual ancestors of the HITS hub\/authority model.<\/li>\n<\/ul>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Social Network Analysis (SNA) Parallels<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>Mapping the Social Graph:<\/strong> Link analysis algorithms are directly applicable to social networks (Twitter, Facebook, LinkedIn).\n<ul class=\"wp-block-list\">\n<li><strong>Follows\/Friending = Links:<\/strong> These edges define the structure of the community.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Centrality Measures:<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Degree Centrality:<\/strong> Users with the most followers 
(Celebrities\/Hubs).<\/li>\n\n\n\n<li><strong>PageRank:<\/strong> Users followed by <em>other influential users<\/em> (Thought Leaders).<\/li>\n\n\n\n<li><strong>Betweenness Centrality:<\/strong> Users who act as bridges or &#8220;gatekeepers&#8221; between different social groups or communities.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Application:<\/strong> Identifying key influencers for marketing, detecting communities (clustering), and predicting information spread (virality).<\/li>\n<\/ul>\n<\/div>\n<\/div>\n\n\n\n<div style=\"height:var(--wp--preset--spacing--40)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Knowledge Graphs and Entity Linking<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>From Documents to Things:<\/strong> Modern search analyzes a graph of <em>entities<\/em> (people, places, concepts), not just documents.\n<ul class=\"wp-block-list\">\n<li><strong>Nodes:<\/strong> Real-world entities (e.g., &#8220;Albert Einstein&#8221;).<\/li>\n\n\n\n<li><strong>Edges:<\/strong> Relationships (e.g., &#8220;born in,&#8221; &#8220;won prize&#8221;).<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Link Analysis for Disambiguation:<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Entity Linking:<\/strong> When a page mentions &#8220;Jaguar,&#8221; the system looks at the link structure of the surrounding text and entities to decide if it maps to the <em>Animal<\/em> node or the <em>Car<\/em> node in the Knowledge Graph.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Relation Strength:<\/strong> Algorithms similar to PageRank are used to score the &#8220;confidence&#8221; of a fact or 
relationship within the graph, filtering out noise.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-group alignwide has-global-padding is-layout-constrained wp-container-core-group-is-layout-19e250f3 wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-group is-vertical is-content-justification-center is-layout-flex wp-container-core-group-is-layout-6329a8f3 wp-block-group-is-layout-flex\">\n<h2 class=\"wp-block-heading has-text-align-center is-style-asterisk\">Modern Trends in Link Analysis<\/h2>\n<\/div>\n\n\n\n<div style=\"height:var(--wp--preset--spacing--40)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Graph Neural Networks (GNNs)<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>The Modern Evolution:<\/strong> Traditional PageRank calculates a single scalar score. 
<strong>Graph Neural Networks (GNNs)<\/strong> learn a high-dimensional <em>vector representation<\/em> (embedding) for every node.<\/li>\n\n\n\n<li><strong>Mechanism:<\/strong>\n<ul class=\"wp-block-list\">\n<li>GNNs (like GraphSAGE or GCN) aggregate information from a node&#8217;s neighbors.<\/li>\n\n\n\n<li>They learn to encode both the <strong>graph structure<\/strong> (who links to whom) and the <strong>node content<\/strong> (text on the page) simultaneously.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Advantage:<\/strong> Unlike static PageRank, inductive GNNs can generalize to new, unseen nodes and capture complex, non-linear structural patterns that simple random walks miss.<\/li>\n<\/ul>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Hybrid Approaches: Combining Signals<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>The &#8220;Learning to Rank&#8221; Framework:<\/strong> Modern search engines do not rely on a single algorithm. They use Machine Learning models (e.g., Gradient Boosted Decision Trees) that take hundreds of features as input.<\/li>\n\n\n\n<li><strong>The Input Features:<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Link Signals:<\/strong> PageRank, HITS Authority, Domain Trust.<\/li>\n\n\n\n<li><strong>Content Signals:<\/strong> Keyword matching, semantic similarity (BERT).<\/li>\n\n\n\n<li><strong>User Behavior:<\/strong> Click-through rates (CTR), dwell time, pogo-sticking.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>The Hybrid Model:<\/strong> The ML model learns the optimal <em>weight<\/em> for each signal. 
For example, it might learn that Link Authority matters more for medical queries (&#8220;symptoms of flu&#8221;) but User Behavior matters more for trending news.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n\n\n\n<div style=\"height:var(--wp--preset--spacing--40)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Ethical Considerations in Ranking<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>The &#8220;Rich Get Richer&#8221; Effect:<\/strong> Link analysis creates a feedback loop. Top-ranked pages get more traffic, which leads to more people citing\/linking to them, which further cements their top ranking.\n<ul class=\"wp-block-list\">\n<li><strong>Consequence:<\/strong> It becomes very difficult for new, high-quality content to break into the top results.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Bias Amplification:<\/strong> If the link structure reflects societal biases (e.g., linking predominantly to male authors in science), the algorithm will amplify this bias, presenting it as objective &#8220;authority.&#8221;<\/li>\n\n\n\n<li><strong>Echo Chambers:<\/strong> Personalization algorithms (based on Personalized PageRank) can trap users in filter bubbles, showing them only content that reinforces their existing worldview.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-group alignwide has-global-padding is-layout-constrained wp-container-core-group-is-layout-19e250f3 wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-group is-vertical is-content-justification-center is-layout-flex wp-container-core-group-is-layout-6329a8f3 wp-block-group-is-layout-flex\">\n<h2 
class=\"wp-block-heading has-text-align-center is-style-asterisk\">Evaluation and Critiques<\/h2>\n<\/div>\n\n\n\n<div style=\"height:var(--wp--preset--spacing--40)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Strengths and Limitations of Link Analysis<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>Strengths:<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Hard to Fake:<\/strong> Unlike on-page text (which the page&#8217;s owner can freely edit), you cannot easily force authoritative sites to link to you.<\/li>\n\n\n\n<li><strong>Global Wisdom:<\/strong> It crowdsources quality judgments from millions of webmasters.<\/li>\n\n\n\n<li><strong>Pre-computable:<\/strong> Scores can be calculated offline (batch processing), enabling fast query-time retrieval.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Limitations:<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Lag:<\/strong> It takes time for new links to be crawled and indexed.<\/li>\n\n\n\n<li><strong>Sparsity:<\/strong> Many high-quality pages (especially new ones) have few or no links (&#8220;The Cold Start Problem&#8221;).<\/li>\n\n\n\n<li><strong>Topic Drift:<\/strong> Pure link analysis can sometimes drift away from the user&#8217;s original query.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Vulnerabilities to Manipulation<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>Link Spam:<\/strong> The commercial value of high 
rankings led to a &#8220;Black Hat SEO&#8221; industry focused on gaming link algorithms.<\/li>\n\n\n\n<li><strong>Common Attacks:<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Link Farms:<\/strong> Networks of sites created solely to link to each other.<\/li>\n\n\n\n<li><strong>Paid Links:<\/strong> Buying links from high-authority sites without editorial oversight.<\/li>\n\n\n\n<li><strong>Comment Spam:<\/strong> Bots posting links in blog comments or forums.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>The Countermeasures:<\/strong> Search engines developed sophisticated countermeasures (such as Google&#8217;s Penguin update and the TrustRank algorithm) to detect unnatural linking patterns and penalize the offenders.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n\n\n\n<div style=\"height:var(--wp--preset--spacing--40)\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns alignwide is-layout-flex wp-container-core-columns-is-layout-d1c656ed wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:40%\">\n<h3 class=\"wp-block-heading is-style-asterisk\">Balancing Link Analysis with Modern ML<\/h3>\n\n\n\n<ul style=\"line-height:1.75\" class=\"wp-block-list is-style-checkmark-list\">\n<li><strong>The Shift:<\/strong> While links remain a strong signal for <em>authority<\/em>, modern ML (specifically Large Language Models and Transformers) has become superior for understanding <em>relevance<\/em> and <em>intent<\/em>.<\/li>\n\n\n\n<li><strong>Current State:<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Content First:<\/strong> Deep learning models (like BERT) ensure the document actually answers the user&#8217;s question.<\/li>\n\n\n\n<li><strong>Links as Validator:<\/strong> Link analysis acts as a &#8220;trust filter&#8221; or tie-breaker. 
If two pages have equally good answers, the one with better link authority wins.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Diminishing Returns:<\/strong> As content understanding improves, the reliance on raw link counts is slowly decreasing, though it remains a fundamental pillar of web search.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Why Links Matter Historical Context: A Breakthrough The Web as a Graph Introduction to Link Analysis The Web as a Graph 7 PageRank 18 Hubs and Authorities (HITS Algorithm) 6 Hubs and Authorities (HITS Algorithm)\u2014Choosing the Subset of the Web 3 Advanced Link-Based Models 6 Link Analysis Beyond Web Search 3 Modern Trends in Link 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":113,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-192","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"http:\/\/ijeesoo.com\/index.php?rest_route=\/wp\/v2\/pages\/192","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/ijeesoo.com\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"http:\/\/ijeesoo.com\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"http:\/\/ijeesoo.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/ijeesoo.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=192"}],"version-history":[{"count":28,"href":"http:\/\/ijeesoo.com\/index.php?rest_route=\/wp\/v2\/pages\/192\/revisions"}],"predecessor-version":[{"id":236,"href":"http:\/\/ijeesoo.com\/index.php?rest_route=\/wp\/v2\/pages\/192\/revisions\/236"}],"up":[{"embeddable":true,"href":"http:\/\/ijeesoo.com\/index.php?rest_route=\/wp\/v2\/pages\/113"}],"wp:attachment":[{"href":"http:\/\/ijeesoo.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=192"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}