In the last decade, detailed analyses of the peer-review schemes applied to the evaluation of observing proposals showed that the outcome of these review processes was affected by significant biases, especially gender-related ones. Dual-anonymous evaluation was identified as an important step to correct for the observed discrimination. Among the several improvements recently implemented in the front end of ESO's operational model (the so-called Phase 1 segment, which includes the preparation, submission, and evaluation of observing proposals), dual-anonymous reviews have now been running for a few years. Here we present the first results of our analysis comparing observing-proposal success rates as a function of the proposer's gender and career status, before and after the introduction of dual-anonymous reviews, and we evaluate the impact that this approach is having on the outcome of the process.
To test the potential disruptive effect of Artificial Intelligence (AI) transformers (e.g., ChatGPT) and their associated Large Language Models on the time-allocation process, both in proposal reviewing and grading, an experiment was set up at ESO for the P112 Call for Proposals. The experiment aims to raise awareness in the ESO community and to build valuable knowledge by identifying what future steps ESO and other observatories might need to take to stay up to date with current technologies. We present here the results of the experiment, which may further be used to inform decision makers regarding the use of AI in the proposal review process. We find that the ChatGPT-adjusted proposals tend to receive lower grades than the original proposals. Moreover, ChatGPT 3.5 generally cannot be trusted to provide correct scientific references, while the most recent version does a better, but far from perfect, job. We also studied how ChatGPT deals with assessing proposals. It does an apparently remarkable job of summarising ESO proposals, although it is not as good at identifying weaknesses. When looking at how it evaluates proposals, however, it appears that ChatGPT systematically gives higher marks than humans and tends to prefer proposals written by itself.