How can you tell if an MT system is `good'? How can you tell which of two systems is `better'? What do `good' and `better' mean in this context? These are the questions that this chapter tries to answer.
In a practical domain like MT, such questions reduce to questions of suitability to users' needs: what is the best and most economical way to deal with the user's translation requirements? In the ideal case, it should be possible to give a simple and straightforward answer to this question in a consumers' magazine. An article in such a magazine would discuss the most important issues with a comparison table displaying the achievements of different MT systems on tests of important aspects such as speed and quality. Unfortunately, the information necessary to make informed judgements is not so readily available, partly because the methods for investigating suitability are not well developed. In reality, MT users can spend quite a lot of money finding out what a system can and cannot do for them. In this chapter we will look at the kind of thing that should matter to potential users of MT systems, and then discuss some existing methods for assessing MT system performance.
As we pointed out in the Introduction (Chapter ), we think that, in the short term, MT is likely to be of most benefit to largish corporate organizations doing a lot of translation. So we adopt this perspective here. However, most of the considerations apply to any potential user.