POJ 1007 DNA Sorting
I. Problem Information
DNA Sorting
Time Limit: 1000MS Memory Limit: 10000K
Total Submissions: 68122 Accepted: 27076
Description
One measure of ``unsortedness'' in a sequence is the number of pairs of entries that are out of order with respect to each other. For instance, in the letter sequence ``DAABEC'', this measure is 5, since D is greater than four letters to its right and E is greater than one letter to its right. This measure is called the number of inversions in the sequence. The sequence ``AACEDGG'' has only one inversion (E and D)---it is nearly sorted---while the sequence ``ZWQM'' has 6 inversions (it is as unsorted as can be---exactly the reverse of sorted).
You are responsible for cataloguing a sequence of DNA strings (sequences containing only the four letters A, C, G, and T). However, you want to catalog them, not in alphabetical order, but rather in order of ``sortedness'', from ``most sorted'' to ``least sorted''. All the strings are of the same length.
Input
The first line contains two integers: a positive integer n (0 < n <= 50) giving the length of the strings; and a positive integer m (0 < m <= 100) giving the number of strings. These are followed by m lines, each containing a string of length n.
Output
Output the list of input strings, arranged from ``most sorted'' to ``least sorted''. Since two strings can be equally sorted, then output them according to the original order.
Sample Input
10 6
AACATGAAGG
TTTTGGCCAA
TTTGGCCAAA
GATCAGATTT
CCCGGGGGGA
ATCGATGCAT
Sample Output
CCCGGGGGGA
AACATGAAGG
GATCAGATTT
ATCGATGCAT
TTTTGGCCAA
TTTGGCCAAA
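II. Algorithm Analysis
1. Brute-force enumeration
This problem boils down to computing the inversion count of a sequence. (Look up the definition of an inversion if the term is unfamiliar.) The most direct method is, for each element of the sequence, to count the elements that come after it and are smaller than it; for now, call this number the inversion count of that element. The sum of the inversion counts of all elements is the inversion count of the sequence.
Counting inversions by brute force takes O(n^2) time per sequence. If the comparator recomputes the counts on every comparison, the sort adds a factor of m log m, for O(n^2 * m log m) overall; computing each count once beforehand would reduce this to O(m * n^2 + m log m).
Also note that the problem requires equally sorted strings to be output in their original order. In other words, two sequences with the same inversion count must appear in the output in the same order as in the input. The sort therefore has to be stable; the stable_sort algorithm from the C++ STL is sufficient.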
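The two points above (compute each count once, then sort stably) can be sketched as follows. This is a minimal illustration, not the post's reference code: `sortBySortedness` is a hypothetical helper name introduced here, and `countUnsortedness` is the same brute-force count used in the reference code.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Brute-force inversion count: pairs (i, j) with i < j and s[j] < s[i].
std::size_t countUnsortedness(const std::string &s)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
        for (std::size_t j = i + 1; j < s.size(); ++j)
            if (s[j] < s[i])
                ++count;
    return count;
}

// Hypothetical helper (an assumption, not from the original post): returns the
// strings ordered from most sorted to least sorted, ties in input order.
std::vector<std::string> sortBySortedness(const std::vector<std::string> &dnas)
{
    // Compute each key exactly once instead of inside the comparator.
    std::vector<std::pair<std::size_t, std::string>> keyed;
    keyed.reserve(dnas.size());
    for (const std::string &s : dnas)
        keyed.emplace_back(countUnsortedness(s), s);

    // Compare keys only; stable_sort preserves the input order of equal keys.
    std::stable_sort(keyed.begin(), keyed.end(),
                     [](const std::pair<std::size_t, std::string> &a,
                        const std::pair<std::size_t, std::string> &b) {
                         return a.first < b.first;
                     });

    std::vector<std::string> result;
    result.reserve(keyed.size());
    for (const auto &entry : keyed)
        result.push_back(entry.second);
    return result;
}
```

Comparing only the precomputed keys matters for correctness as well as speed: comparing the whole pair would fall back to alphabetical order among ties, which is not what the problem asks for.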
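2. Merge sort
For counting inversions alone, merge sort is a better method. During the merge step, whenever an element from the right half is placed ahead of elements still remaining in the left half, add the number of those remaining elements to a counter; the final counter value is the inversion count. This brings the cost per sequence down to O(n log n), and the total, with the count recomputed inside each comparison, down to O(n log n * m log m).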
III. Reference Code
1. Brute-force C++ code
[cpp]
#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
using namespace std;

// Count the inversions of s: pairs (i, j) with i < j and s[j] < s[i].
size_t countUnsortedness(const string &s)
{
    size_t sum = 0;
    for (size_t i = 0; i < s.size(); i++)
        for (size_t j = i + 1; j < s.size(); j++)
            if (s[j] < s[i])
                sum++;
    return sum;
}

// Order by inversion count; the counts are recomputed on every call.
bool comp(const string &s1, const string &s2)
{
    return countUnsortedness(s1) < countUnsortedness(s2);
}

int main()
{
    vector<string> DNAs;
    int n, m;
    cin >> n >> m;
    // Read the m strings; n (their length) is not needed afterwards.
    for (string s; m-- > 0 && cin >> s; )
        DNAs.push_back(s);
    // stable_sort keeps the input order of strings with equal counts.
    stable_sort(DNAs.begin(), DNAs.end(), comp);
    for (size_t i = 0; i < DNAs.size(); i++)
        cout << DNAs[i] << endl;
    return 0;
}
2. Merge sort C++ code