Education that suits the individual learning level is necessary to improve students’ understanding. The first step in achieving this purpose by using large language models (LLMs) is to adjust the textual difficulty of the response to students. This work analyzes how LLMs can implicitly adjust text difficulty between user input and its generated text. To conduct the experiments, we created a new dataset from Stack-Overflow to explore the performance of question-answering-based conversation. Experimental results on the Stack-Overflow dataset and the TSCC dataset, including multi-turn conversation show that LLMs can implicitly handle text difficulty between user input and its generated response. We also observed that some LLMs can surpass humans in handling text difficulty and the importance of instruction-tuning.