Publication: Inferring human values for safe AGI design

dc.contributor.author: Sezener, Can Eren
dc.contributor.ozugradstudent: Sezener, Can Eren
dc.date.accessioned: 2016-06-30T12:33:30Z
dc.date.available: 2016-06-30T12:33:30Z
dc.date.issued: 2015
dc.description.abstract: Aligning the goals of superintelligent machines with human values is one way to pursue safety in AGI systems. To achieve this, it is first necessary to learn what human values are. However, human values are extremely complex and cannot easily be formalized by hand. In this work, we propose a general framework for estimating the values of a human given their behavior.
dc.identifier.doi: 10.1007/978-3-319-21365-1_16
dc.identifier.endpage: 155
dc.identifier.isbn: 978-3-319-21365-1
dc.identifier.scopus: 2-s2.0-84952760405
dc.identifier.startpage: 152
dc.identifier.uri: http://hdl.handle.net/10679/4184
dc.identifier.uri: https://doi.org/10.1007/978-3-319-21365-1_16
dc.identifier.wos: 000363479400016
dc.language.iso: eng [en_US]
dc.peerreviewed: yes
dc.publicationstatus: published [en_US]
dc.publisher: Springer International Publishing
dc.relation.ispartof: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.relation.publicationcategory: International
dc.rights: restrictedAccess
dc.subject.keywords: Value learning
dc.subject.keywords: Inverse reinforcement learning
dc.subject.keywords: Friendly AI
dc.subject.keywords: Safe AGI
dc.title: Inferring human values for safe AGI design [en_US]
dc.type: bookPart [en_US]
dc.type.subtype: Book chapter
dspace.entity.type: Publication
