Some lines of code in the actor-critic implementation in the reinforcement learning code examples have bugs caused by changes in newer versions of the libraries, and a few tweaks are needed.

This is especially true with newer versions of the environment being used. Issues arise when performing the fundamental actions on the environment. On the first run, one runs into this error:

      `ValueError: Exception encountered when calling Functional.call().

      Invalid input shape for input [-0.04058227]. Expected shape (None, 4), but input has incompatible shape (1,)

      Arguments received by Functional.call():
        • inputs=tf.Tensor(shape=(1,), dtype=float32)
        • training=None
        • mask=None`
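The shape in the message can be reproduced without TensorFlow. The following is a minimal sketch using NumPy, assuming the failure comes from indexing `[0]` into a bare observation array rather than into a `(observation, info)` tuple. Only the first value is taken from the error above; the other three are made-up placeholders.

```python
import numpy as np

# A CartPole-style observation with 4 entries.
# The first value is from the error message; the rest are placeholders.
observation = np.array([-0.04058227, 0.01, 0.02, -0.03], dtype=np.float32)

# Expected case: the full observation gets a batch dimension.
good = np.expand_dims(observation, 0)

# Failing case (assumption): [0] picked a single element of the observation,
# so adding a batch dimension produces a 1-element vector.
bad = np.expand_dims(observation[0], 0)

print(good.shape)  # (1, 4)
print(bad.shape)   # (1,)
```

A model that expects `(None, 4)` accepts the first shape and rejects the second.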

There are many other small tweaks needed to make it run on the first attempt.

Here are some of the relevant code snippets:

`state = env.reset()[0]`

This line in particular causes an issue with how Python types are processed.

`while True:  # Run until solved
    state = env.reset()[0]
    episode_reward = 0
    with tf.GradientTape() as tape:
        for timestep in range(1, max_steps_per_episode):

            state = ops.convert_to_tensor(state)
            state = ops.expand_dims(state, 0)`

As a whole, this piece raises an error where the state cannot be converted to a tensor. These are just some of the issues; there are a few more.
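One way to make the loop tolerant of either environment API is sketched below. It assumes the problem comes from the change in return values between older Gym releases, where `reset()` returns the observation and `step()` returns four values, and newer Gym or Gymnasium releases, where `reset()` returns `(observation, info)` and `step()` returns five values. The helper names `unpack_reset` and `unpack_step` are hypothetical and are not part of the Keras example.

```python
def unpack_reset(result):
    """Return only the observation from the value returned by env.reset()."""
    # Newer API: reset() returns (observation, info), where info is a dict.
    if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], dict):
        return result[0]
    # Older API: reset() returns the observation itself.
    return result


def unpack_step(result):
    """Return (observation, reward, done) from the value returned by env.step()."""
    if len(result) == 5:
        # Newer API: (observation, reward, terminated, truncated, info).
        obs, reward, terminated, truncated, _info = result
        return obs, reward, bool(terminated or truncated)
    # Older API: (observation, reward, done, info).
    obs, reward, done, _info = result
    return obs, reward, bool(done)
```

In the loop above, `state = unpack_reset(env.reset())` would then replace `state = env.reset()[0]`, so the state keeps all four entries regardless of which API the installed environment uses.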

Comment From: lmntrx-sys

This is the URL: https://keras.io/examples/rl/actor_critic_cartpole/

Comment From: sonali-kumari1

Hi @lmntrx-sys -

Thanks for reporting this issue. I have tested the actor-critic reinforcement learning code example in this gist and I was able to reproduce the ValueError you encountered. Would you be interested in contributing a fix for this?

Comment From: lmntrx-sys

Hi @sonali-kumari1, sorry for the late reply. I would be happy to provide a fix. I already have one in a Colab notebook in my account and am just waiting for your go-ahead.

Comment From: lmntrx-sys

I have pushed a fix for the ValueError. Check it out in my repository: lmntrx-sys/Research/Actor_Critic_method.ipynb