Author: Sezener, Can Eren
Date accessioned: 2016-06-30
Date available: 2016-06-30
Date issued: 2015
ISBN: 978-3-319-21365-1
Handle: http://hdl.handle.net/10679/4184
DOI: https://doi.org/10.1007/978-3-319-21365-1_16
Abstract: Aligning the goals of superintelligent machines with human values is one way to pursue safety in AGI systems. To achieve this, it is first necessary to learn what human values are. However, human values are incredibly complex and cannot easily be formalized by hand. In this work, we propose a general framework for estimating the values of a human given their behavior.
Language: eng
Access: restrictedAccess
Title: Inferring human values for safe AGI design
Type: bookPart
Pages: 152-155
WOS ID: 000363479400016
Keywords: Value learning; Inverse reinforcement learning; Friendly AI; Safe AGI
Scopus ID: 2-s2.0-84952760405