Editorial for Problem J

Megalopolis

Author : nagajahnavidann1

Required Knowledge : Greedy, Isotonic merging

Time Complexity : O(N)\mathcal{O}(N)

Editorialist : nagajahnavidann1

Approach:

The first part of the question - finding the values of the functions chaos score and minimum representation score is straight forward:

1.Chaos Score (CC): For any given string, the chaos score can be found by directly comparing the corresponding positions of the characters in the given string and the sorted string.

2.Minimum Representation Score (MM): This can be obtained by maintaining a frequency array to find the minimum repeated character in the string.

Moving into the second part: Merging.

We have to merge in such a way to obtain the chaos scores of merged groups to be lexicographically smallest.

If there are 2 megalopolises of different disorder metrics, merging the second with the first is beneficial (i.e., it would produce a lexicographically smaller value than without merging) if the second disorder metric is lesser than the first.

Proof:

Find the chaos function and min rep function prefix sums:

  • pci=c0+c1++cipc_i = c_0 + c_1 + \dots + c_i

  • pmi=m0+m1++mipm_i = m_0 + m_1 + \dots + m_i

The disorder metric for a range ii to jj is given by: pcipcjpmipmj\frac{pc_i - pc_j}{pm_i - pm_j}

This is the slope of the plot between pcpc values and pmpm values from point ii to jj. The points in the plot are monotonically increasing, so the lines joining the lower convex points provide the minimum slope i.e., the lexicographically minimum most disorder metric that can be obtained.

Simplified Mathematical Proof:

Let the first megalopolis have scores (a,b)(a, b) and the second have (c,d)(c, d), where the metrics are D1=abD_1 = \frac{a}{b} and D2=cdD_2 = \frac{c}{d}. If D1>D2D_1 > D_2, then by the property of mediants: cd<a+cb+d<ab\frac{c}{d} < \frac{a+c}{b+d} < \frac{a}{b}

Thus, it is always greedy to merge 2 megalopolises if the second one has a lower disorder metric than the first.

This can be simply implemented in a monotonic stack (maintaining increasing values of the disorder metric).

Handling Equal Values:

In case of the same value of disorder metric for 2 megalopolises, the lexicographically minimal arrangement would be to not merge them.

Example: 1,2,2,3,41, 2, 2, 3, 4 is smaller than 1,2,3,41, 2, 3, 4.

However, in the case of equal scores at the end of the array, merging the megalopolises gives the solution.

Example: 1,2,3,41, 2, 3, 4 is smaller than 1,2,3,4,4,41, 2, 3, 4, 4, 4.

Setter's Code:

//Author: nagajahnavidann1 >_<
#include <bits/stdc++.h>
using namespace std;

#define fastio ios::sync_with_stdio(false); cin.tie(nullptr)
#define pb push_back
#define full(v) v.begin() ,v.end()
typedef bool bl;
typedef long long int ll;
template<class T>void inp(vector<T>&a){for(auto &x:a)cin>>x;}

//---------------------solve------------------
void dj()
{ 
    ll n;
    cin>>n;
    vector<string>a(n);
    inp(a);
    vector<ll>c(n),m(n,LLONG_MAX);
    //precomputation
    for(int j=0;j<n;j++){
        vector<ll>fr(26,0);
        for(char ch:a[j]){
            fr[ch-'a']++;
        }
        for(int i=0;i<26;i++){
            if(fr[i]!=0)
            m[j]=min(m[j],fr[i]);
        }
        string t=a[j];
        sort(full(t));
        for(int i=0;i<a[j].size();i++){
            if(a[j][i]!=t[i])
            c[j]++;
        }
    }

    //merging
    stack<pair<ll,ll>> st;
    for(int i=0;i<n;i++){
        ll num=c[i],den=m[i];

        while(!st.empty()){
            auto [pnum,pden]=st.top();
            if(pnum*(den) > pden*(num)){
                st.pop();
                num+=pnum;
                den+=pden;
            }else break;
        }

        st.push({num,den});
    }
    ll num=st.top().first,den=st.top().second;
    while(!st.empty() && st.top().first*den==num*st.top().second)
    st.pop();
    cout<<st.size()+1<<endl;
    vector<double> ans;
    ans.pb((double)num/(double)den);
    while(!st.empty()){
        // cout<<st.top().first<<" "<<st.top().second;
        ans.pb((double)st.top().first/(double)st.top().second);
        st.pop();
    }
    for(int i=ans.size()-1;i>=0;i--)
    cout<<fixed<<setprecision(6)<<ans[i]<<" ";
    cout<<endl;
}   
 
// ------------------------------------------------------------------------------------
 
signed main(){
	fastio;
	int t;cin>>t;while(t--)
	dj();
}

Tester's Code:


from sys import stdin, stdout
from math import inf
from collections import deque

def UNsolve():
    input = stdin.readline
    n = int(input())
    a = [input().strip() for _ in range(n)]
    score = [0] * n
    cnt = [0] * n

    for k, s in enumerate(a):
        t = sorted(s)
        f = [0] * 26
        for i, ch in enumerate(s):
            if t[i] != ch:
                score[k] += 1
            f[ord(ch) - ord('a')] += 1
        mn = inf
        for x in f:
            if x != 0:
                mn = min(mn, x)
        cnt[k] = mn

    st = []
    for i in range(n):
        st.append([score[i], cnt[i]])
        while len(st) >= 2:
            curr, cntt = st.pop()
            psum, pcnt = st[-1]
            currsum = psum + curr
            currcnt = pcnt + cntt
            if currsum * pcnt < psum * currcnt:
                st[-1][0] = currsum
                st[-1][1] = currcnt
            else:
                st.append([curr, cntt])
                break

    ans = []
    while st:
        sum_, cntt = st.pop()
        ans.append(sum_ / cntt)
    ans.reverse()

    while len(ans) >= 2 and ans[-1] == ans[-2]:
        ans.pop()

    stdout.write(str(len(ans)) + "\n")
    stdout.write(" ".join(f"{x:.10f}" for x in ans) + "\n")


def main():
    input = stdin.readline
    test_cases = True
    t = 1
    if test_cases:
        t = int(input())
    for _ in range(t):
        UNsolve()

if __name__ == "__main__":
    main()