Editorial for Problem D

Avengers Team Selection

Author: muditk_24

Required Knowledge : Basic Graphs, Implementation

Time Complexity : O((N+M)logN)\mathcal{O}((N + M) \cdot logN)

Editorialist : muditk_24

Approach:

After clearly understanding the problem, the solution depends on the implementation of the review rounds.

Problem Simplification: Undirected Graph

The problem states there are MM one-way support links ((a directed graph)). A hero's Connection Score is defined as:

Connection Score = number of heroes they rely on + number of heroes who rely on them

Connection Score = out-degree + in-degree

Here's a key insight, if we model the graph as undirected, a node's degree is the sum of its incoming and outgoing edges. Therefore, the Connection Score of a hero is identical to their degree in an equivalent undirected graph.

This simplification is powerful. When a node uu is removed, its edge (u,v)(u, v) is also removed. This single action correctly decrements the Connection Score ((degree)) of its neighbor vv by 11. We can now treat the problem as an undirected graph, which is simpler to implement.

Implementation 1: Priority Queue ((Optimized))

To efficiently simulate the rounds, we need to quickly find all nodes with a score strictly greater than the Selection Threshold ((the average score)).

We can use a set/maxheap (acting as a max-priority queue) to store pairs of score,hero{score, hero}. This allows us to find the max-score hero in O(logN)\mathcal{O}(logN) time and iterate from highest to lowest.

In each round, we calculate the average. We then iterate from the top of the set, removing all heroes whose score is greater than the average.

For each hero u we remove, we iterate through their neighbors vv. If vv is still in the set (i.e.,(i.e., not also being cut in this same round)), we must update its score. This involves removing vv's old entry, decrementing its score, and re-inserting it into the set.

We continue this process until there's no set of heroes/nodes to be removed from the graph.

Implementation 2: Brute Force ((Why it Works))

What if we don't use a priority queue? A brute-force simulation seems too slow.

In each round, iterate through all NN possible nodes to find the active ones, sum their scores, and calculate the average. Then, perform a second pass to find and "mark" all nodes to be cut. Finally, iterate through the marked nodes and update their neighbors' scores.

Complexity: Let RR be the total number of review rounds. In each round, we do O(N)\mathcal{O}(N) work to find the average and identify nodes to cut. The work of updating neighbor scores, when summed across all rounds, is O(M)\mathcal{O}(M) ((since each edge is removed only once)).

The total complexity for this approach is O(RN+M)\mathcal{O}(R \cdot N + M)

This solution would fail if RR were large, for example O(N)\mathcal{O}(N). However, the given constraints allow this solution to pass. The reason is that RR ((the number of rounds)) is guaranteed to be extremely small.

Proof: Why R is small

The graph decomposes very quickly. Let's analyze the worst-case scenario to maximize the number of rounds, RR.

To maximize RR, we want to remove as few nodes as possible per round, ideally just one.

Let's try to construct a graph with scores s1<=s2<=...<=sNs_1 <= s_2 <= ... <= s_N where only one node is removed each time.

Round 11: We need sN>avg(s1,...,sN)s_N > avg(s_1, ..., s_N) and sN1<=avg(s1,...,sN)s_{N-1} <= avg(s_1, ..., s_N).

Round 22: After removing sNs_N and its edges, we need the new max score to be just above the new average.

Your example sequence demonstrates this worst-case construction:

N=10:[1,2,2,3,5,7,11,16,24,36]N = 10 : [1, 2, 2, 3, 5, 7, 11, 16, 24, 36]
N=30:[...,80032,120048]N = 30 : [..., 80032, 120048]

This shows that to create a scenario that lasts NN rounds ((by removing one node at a time)), the required Connection Scores must grow exponentially with NN.

The total score S=sum(si)S = sum(s_i) also grows exponentially with NN.

Since the total number of links M=S/2,M = S/2, this means MM must grow exponentially relative to NN.

Flipping this, if we have MM links, the maximum NN that can participate in such a long-running sequence is N=N = O(logN)\mathcal{O}(logN).

The number of rounds RR is always less than or equal to NN ((since we must remove at least one hero per round for the process to continue)).

Therefore, the maximum number of rounds RR is bounded by O(logM)\mathcal{O}(logM).

Since RR is logarithmically small, the brute force complexity of O(RN+M)\mathcal{O}(R \cdot N + M) is highly efficient and well within the time limits.

Setter's Code:

#include <bits/stdc++.h>
using namespace std;
#define int long long

void solve(){
    int n, m; cin >> n >> m;
    vector<vector<int>> adj(n + 1);
    vector<int> score(n + 1, 0);
    int tot = 0;
    // undirected for simplicity. 
    for(int i = 0; i < m; i++){
        int u, v; cin >> u >> v;
        adj[u].push_back(v);
        adj[v].push_back(u);
        score[u]++;
        score[v]++;
    }
    int size = n;
    int ans = 0;
    set<pair<int,int>, greater<pair<int,int>>> pq;
    for(int i = 1; i <= n; i++){
        tot += score[i];
        pq.insert({score[i], i});
    }
    while(1){
        if(pq.empty() || tot == 0) break;
        double avg = (1.0 * tot) / pq.size();
        auto x = *pq.begin();
        if(x.first * 1.00 > avg) ans++;
        else break;
        queue<int> nodes;
        int size = pq.size();
        // finding nodes which have a connection score strictly greater than average. 
        while(!pq.empty()){
            auto a = *pq.begin();
            if(a.first > avg){
                tot -= a.first;
                nodes.push(a.second);
                pq.erase({score[a.second], a.second});
                score[a.second]=0;
            }
            else break;
        }
        if(nodes.empty()) break;
        // updating the connection scores.
        while(!nodes.empty()){
            int src = nodes.front();
            nodes.pop();
            for(auto v : adj[src]){
                if(pq.find({score[v], v}) != pq.end()){
                    tot--;
                    pq.erase({score[v], v});
                    score[v]--;
                    pq.insert({score[v], v});
                }
            }
        }
    }
    cout << ans + 1 << endl; // + 1 for the round in which the process stops. 
}

signed main() {
    int t; cin >> t;
    while(t--){
      solve();
    }
    return 0;
}

Tester's Code:

#include <bits/stdc++.h>
using namespace std;
#define int long long

void solve(){
    int n, m; cin >> n >> m;
    vector<vector<int>> adj(n + 1);
    vector<vector<int>> radj(n + 1);
    for(int i = 0; i < m; ++i){
        int u, v; cin >> u >> v;
        adj[u].push_back(v);
        radj[v].push_back(u);
    }
    vector<bool> active(n + 1, true);
    active[0] = false; 
    int rounds = 0;
 
    while(true){
        int curr = 0, tot = 0;
        vector<int> scores(n + 1, 0);
 
        for(int i = 1; i <= n; ++i){
            if(active[i]){
                curr++;
                int score = 0;
                for(int node : adj[i]){
                    if(active[node]) score++;
                }
                for(int node : radj[i]){
                    if(active[node]) score++;
                }
                scores[i] = score;
                tot += score;
            }
        }
        
        if(curr == 0) break;
 
        vector<int> remove;
        for(int i = 1; i <= n; ++i){
            if(active[i]){
                if(scores[i] * curr > tot){
                    remove.push_back(i);
                }
            }
        }
 
        if(remove.empty()) break;
 
        rounds++;
        for(int u : remove){
            active[u] = false;
        }
    }
 
    cout << rounds + 1 << endl; 
}

signed main() {
    int t; cin >> t;
    while(t--){
      solve();
    }
    return 0;
}

// Complexity: O(R * N + M)